Polynucleotides and polypeptides in plants

ABSTRACT

The invention relates to plant transcription factor polypeptides, polynucleotides that encode them, homologs from a variety of plant species, and methods of using the polynucleotides and polypeptides to produce transgenic plants having advantageous properties compared to a reference plant. Sequence information related to these polynucleotides and polypeptides, which can also be used in bioinformatic search methods, is also disclosed.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a U.S. national phase application of International Application No. PCT/US2004/005654, filed Feb. 25, 2004, which claims the benefit of U.S. Non-provisional application Ser. No. 10/374,780, filed Feb. 25, 2003, and U.S. Non-provisional application Ser. No. 10/675,852, filed Sep. 30, 2003. This application is also a continuation-in-part of U.S. Non-provisional application Ser. No. 10/374,780, filed Feb. 25, 2003. This application is also a continuation-in-part of U.S. Non-provisional application Ser. No. 10/675,852, filed Sep. 30, 2003. All of the foregoing applications are hereby incorporated by reference in their entirety.

DISCLOSURE OF PARTIES TO A JOINT RESEARCH AGREEMENT

The claimed invention, in the field of functional genomics and the characterization of plant genes for the improvement of plants, was made by or on behalf of Mendel Biotechnology, Inc. and Monsanto Corporation as a result of activities undertaken within the scope of a joint research agreement, said agreement having been executed on Oct. 31, 1997, and in effect on or before the date the claimed invention was made.

RELATIONSHIP TO COPENDING APPLICATIONS

This application claims the benefit of U.S. Non-provisional application Ser. No. 10/374,780, filed 25 Feb. 2003, and U.S. Non-provisional application Ser. No. 10/675,852, filed 30 Sep. 2003, the contents of which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD

This invention relates to the field of plant biology. More particularly, the present invention pertains to compositions and methods for modifying a plant phenotypically.

BACKGROUND OF THE INVENTION

A plant's traits, such as its biochemical, developmental, or phenotypic characteristics, may be controlled through a number of cellular processes. One important way to manipulate that control is through transcription factors, which are proteins that influence the expression of a particular gene or sets of genes. Transformed and transgenic plants that comprise cells having altered levels of at least one selected transcription factor, for example, possess advantageous or desirable traits. Strategies for manipulating traits by altering a plant cell's transcription factor content can therefore result in plants and crops with new and/or improved commercially valuable properties.

Transcription factors can modulate gene expression, either increasing or decreasing (inducing or repressing) the rate of transcription. This modulation results in differential levels of gene expression at various developmental stages, in different tissues and cell types, and in response to different exogenous (e.g., environmental) and endogenous stimuli throughout the life cycle of the organism.

Because transcription factors are key controlling elements of biological pathways, altering the expression levels of one or more transcription factors can change entire biological pathways in an organism. For example, manipulation of the levels of selected transcription factors may result in increased expression of economically useful proteins or biomolecules in plants or improvement in other agriculturally relevant characteristics. Conversely, blocked or reduced expression of a transcription factor may reduce biosynthesis of unwanted compounds or remove an undesirable trait. Therefore, manipulating transcription factor levels in a plant offers tremendous potential in agricultural biotechnology for modifying a plant's traits. A number of the agriculturally relevant characteristics of plants, and desirable traits that may be conferred through gene expression, are listed below.

Useful Plant Traits

Category: Abiotic Stress; Desired Trait: Chilling Tolerance

The term “chilling sensitivity” has been used to describe many types of physiological damage produced at low, but above freezing, temperatures. Most crops of tropical origins such as soybean, rice, maize and cotton are easily damaged by chilling. Typical chilling damage includes wilting, necrosis, chlorosis or leakage of ions from cell membranes. The underlying mechanisms of chilling sensitivity are not completely understood yet, but probably involve the level of membrane saturation and other physiological deficiencies. For example, photoinhibition of photosynthesis (disruption of photosynthesis due to high light intensities) often occurs under clear atmospheric conditions subsequent to cold late summer/autumn nights. By some estimates, chilling accounts for monetary losses in the United States (US) second only to drought and flooding. For example, chilling may lead to yield losses and lower product quality through the delayed ripening of maize. Another consequence of poor growth is the rather poor ground cover of maize fields in spring, often resulting in soil erosion, increased occurrence of weeds, and reduced uptake of nutrients. A retarded uptake of mineral nitrogen could also lead to increased losses of nitrate into the ground water.

Category: Abiotic Stress; Desired Trait: Freezing Tolerance.

Freezing is a major environmental stress that limits where crops can be grown and reduces yields considerably, depending on the weather in a particular growing season. In addition to exceptionally stressful years that cause measurable losses of billions of dollars, less extreme stress almost certainly causes smaller yield reductions over larger areas, producing losses of similar dollar value every year. For instance, in the US, the 1995 early fall frosts are estimated to have caused losses of over one billion dollars to corn and soybeans. The spring of 1998 saw an estimated $200 M of damages to Georgia alone, in the peach, blueberry and strawberry industries. The occasional freezes in Florida have shifted the citrus belt further south due to $100 M or more losses. California sustained $650 M of damage in 1998 to the citrus crop due to a winter freeze. In addition, certain crops such as Eucalyptus, which has the very favorable properties of rapid growth and good wood quality for pulping, are not able to grow in the southeastern states due to occasional freezes.

Inherent winter hardiness of the crop determines in which agricultural areas it can survive the winter. For example, for wheat, the northern central portion of the US has winters that are too cold for good winter wheat crops. Approximately 20% of the US wheat crop is spring wheat, with a market value of $2 billion. Areas growing spring wheat could benefit by growing winter wheat that had increased winter hardiness. Assuming a 25% yield increase when growing winter wheat, this would create $500 M of increased value. Additionally, the existing winter wheat is severely stressed by freezing conditions and should have improved yields with increased tolerance to these stresses. An estimate of the yield benefit of these traits is 10% of the $4.4 billion winter wheat crop in the US or $444 M of yield increase, as well as better survival in extreme freezing conditions that occur periodically.
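The value estimates in the preceding paragraph can be reproduced with a short calculation. The following is a minimal sketch only; the function name and variable names are illustrative assumptions, and the dollar figures and percentages are those quoted in the paragraph. Note that 10% of $4.4 billion computes to $440 M, slightly below the $444 M quoted, which may reflect rounding of the underlying crop value.

```python
def added_value_usd(crop_value_usd: int, percent_gain: int) -> int:
    """Return the added dollar value from a percentage yield gain.

    Integer arithmetic is used so the results are exact whole dollars.
    """
    return crop_value_usd * percent_gain // 100


# Figures quoted in the text (US dollars).
SPRING_WHEAT_VALUE = 2_000_000_000   # market value of the US spring wheat crop
WINTER_WHEAT_VALUE = 4_400_000_000   # market value of the US winter wheat crop

# Replacing spring wheat with hardier winter wheat: assumed 25% yield increase.
spring_to_winter_gain = added_value_usd(SPRING_WHEAT_VALUE, 25)

# Improved freezing tolerance in existing winter wheat: assumed 10% yield benefit.
winter_hardiness_gain = added_value_usd(WINTER_WHEAT_VALUE, 10)

print(f"Spring-to-winter conversion: ${spring_to_winter_gain / 1e6:.0f} M")
print(f"Winter wheat tolerance gain: ${winter_hardiness_gain / 1e6:.0f} M")
```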

Thus plants more resistant to freezing, both midwinter freezing and sudden freezes, would protect a farmer's investment, improve yield and quality, and allow some geographies to grow more profitable and productive crops. Additionally, winter crops such as canola, wheat and barley have 25% to 50% yield increases relative to spring-planted varieties of the same crops. This yield increase is due to the “head start” the fall-planted crop has over the spring-planted crop and its reaching maturity earlier while the temperatures, soil moisture and lack of pathogens provide more favorable conditions.

Category: Abiotic Stress; Desired Trait: Salt Tolerance.

One in five hectares of irrigated land is damaged by salt, an important historical factor in the decline of ancient agrarian societies. This condition is only expected to worsen, further reducing the availability of arable land and crop production, since none of the top five food crops (wheat, corn, rice, potatoes, and soybean) can tolerate excessive salt.

Detrimental effects of salt on plants are a consequence of both water deficit resulting in osmotic stress (similar to drought stress) and the effects of excess sodium ions on critical biochemical processes. As with freezing and drought, high salinity causes water deficit; the presence of high salt makes it difficult for plant roots to extract water from their environment (Buchanan et al. (2000) in Biochemistry and Molecular Biology of Plants, American Society of Plant Physiologists, Rockville, Md.). Soil salinity is thus one of the more important variables that determines where a plant may thrive. In many parts of the world, sizable land areas are uncultivable due to naturally high soil salinity. To compound the problem, salination of soils that are used for agricultural production is a significant and increasing problem in regions that rely heavily on agriculture. The latter is compounded by over-utilization, over-fertilization and water shortage, typically caused by climatic change and the demands of increasing population. Salt tolerance is of particular importance early in a plant's lifecycle, since evaporation from the soil surface causes upward water movement, and salt accumulates in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt level in the whole soil profile.

Category: Abiotic Stress; Desired Trait: Drought Tolerance.

While much of the weather that we experience is brief and short-lived, drought is a more gradual phenomenon, slowly taking hold of an area and tightening its grip with time. In severe cases, drought can last for many years, and can have devastating effects on agriculture and water supplies. With burgeoning population and chronic shortage of available fresh water, drought is not only the number one weather-related problem in agriculture, it also ranks as one of the major natural disasters of all time, causing not only economic damage, but also loss of human lives. For example, losses from the US drought of 1988 exceeded $40 billion, exceeding the losses caused by Hurricane Andrew in 1992, the Mississippi River floods of 1993, and the San Francisco earthquake in 1989. In some areas of the world, the effects of drought can be far more severe. In the Horn of Africa the 1984-1985 drought led to a famine that killed 750,000 people.

Problems for plants caused by low water availability include mechanical stresses caused by the withdrawal of cellular water. Drought also causes plants to become more susceptible to various diseases (Simpson (1981). “The Value of Physiological Knowledge of Water Stress in Plants”, In Water Stress on Plants, (Simpson, G. M., ed.), Praeger, NY, pp. 235-265).

In addition to the many land regions of the world that are too arid for most if not all crop plants, overuse of available water is resulting in an increasing loss of agriculturally usable land, a process which, in the extreme, results in desertification. The problem is further compounded by increasing salt accumulation in soils, as described above, which adds to the loss of available water in soils.

Category: Abiotic Stress; Desired Trait: Heat Tolerance.

Germination of many crops is very sensitive to temperature. A transcription factor that would enhance germination in hot conditions would be useful for crops that are planted late in the season or in hot climates.

Seedlings and mature plants that are exposed to excess heat may experience heat shock, which may arise in various organs, including leaves and particularly fruit, when transpiration is insufficient to overcome heat stress. Heat also damages cellular structures, including organelles and cytoskeleton, and impairs membrane function (Buchanan, supra).

Heat shock may result in a decrease in overall protein synthesis, accompanied by expression of heat shock proteins. Heat shock proteins function as chaperones and are involved in refolding proteins denatured by heat.

Category: Abiotic Stress; Desired Trait: Tolerance to Low Nitrogen and Phosphorus.

The ability of all plants to remove nutrients from their environment is essential to survival. Thus, identification of genes that encode polypeptides with transcription factor activity may allow for the generation of transgenic plants that are better able to make use of available nutrients in nutrient-poor environments.

Among the most important macronutrients for plant growth that have the largest impact on crop yield are nitrogenous and phosphorus-containing compounds. Nitrogen- and phosphorus-containing fertilizers are used intensively in agricultural practices today. An increase in grain crop yields from 0.5-1.0 metric tons per hectare to 7 metric tons per hectare accompanied the use of commercial fixed nitrogen fertilizer in production farming (Vance (2001) Plant Physiol. 127: 390-397). Given current practices, in order to meet food production demands in years to come, considerable increases in the amount of nitrogen- and phosphorus-containing fertilizers will be required (Vance, supra).

Nitrogen is the most abundant element in the Earth's atmosphere, yet it is one of the most limiting elements to plant growth due to its lack of availability in the soil. Plants obtain N from the soil from several sources including commercial fertilizers, manure and the mineralization of organic matter. The intensive use of N fertilizers in present agricultural practices is problematic. N fertilizer is made by the energy-intensive Haber-Bosch process, and it is estimated that the US annually uses between 3% and 5% of the nation's natural gas for this process. In addition to the expense of N fertilizer production and the depletion of non-renewable resources, the use of N fertilizers has led to the eutrophication of freshwater ecosystems and the contamination of drinking water due to the runoff of excess fertilizer into ground water supplies.

Phosphorus is second only to N in its importance as a macronutrient for plant growth and in its impact on crop yield. Phosphorus (P) is extremely immobile and not readily available to roots in the soil and is therefore often growth limiting to plants. Inorganic phosphate (Pi) is a constituent of several important molecules required for energy transfer, metabolic regulation and protein activation (Marschner (1995) Mineral Nutrition of Higher Plants, 2nd ed., Academic Press, San Diego, Calif.). Plants have evolved several strategies to help cope with P and N deprivation that include metabolic as well as developmental adaptations. Most, if not all, of these strategies have components that are regulated at the level of transcription and therefore are amenable to manipulation by transcription factors. Metabolic adaptations include increasing the availability of P and N by increasing uptake from the soil through the induction of high affinity and low affinity transporters, and/or increasing its mobilization in the plant. Developmental adaptations include increases in primary and secondary roots, increases in root hair number and length, and associations with mycorrhizal fungi (Bates and Lynch (1996) Plant Cell Environ. 19: 529-538; Harrison (1999) Annu. Rev. Plant Physiol. Plant Mol. Biol. 50: 361-389).

Category: Biotic Stress; Desired Trait: Disease Resistance.

Disease management is a significant expense in crop production worldwide. According to EPA reports for 1996 and 1997, US farmers spend approximately $6 billion on fungicides annually. Despite this expenditure, according to a survey conducted by the Food and Agriculture Organization (FAO), plant diseases still reduce worldwide crop productivity by 12%, and in the United States alone, economic losses due to plant pathogens amount to 9.1 billion dollars (FAO, 1993). Data from these reports and others demonstrate that despite the availability of chemical control only a small proportion of the losses due to disease can be prevented. Not only are fungicides and anti-bacterial treatments expensive to growers, but their widespread application poses both environmental and health risks. The use of plant biotechnology to engineer disease resistant crops has the potential to make a significant economic impact on agriculture and forestry industries in two ways: reducing the monetary and environmental expense of fungicide application and reducing both pre-harvest and post-harvest crop losses that occur now despite the use of costly disease management practices.

Fungal, bacterial, oomycete, viral, and nematode diseases of plants are ubiquitous and important problems, and often severely impact yield and quality of crop and other plants. A very few examples of diseases of plants include:

Powdery mildew, caused by the fungi Erysiphe, Sphaerotheca, Phyllactinia, Microsphaera, Podosphaera, or Uncinula, in, for example, wheat, bean, cucurbit, lettuce, pea, grape, tree fruit crops, as well as roses, phlox, lilacs, grasses, and Euonymus;

Fusarium-caused diseases such as Fusarium wilt in cucurbits, Fusarium head blight in barley and wheat, wilt and crown and root rot in tomatoes;

Sudden oak death, caused by the oomycete Phytophthora ramorum; this disease was first detected in 1995 in California tan oaks. The disease has since killed more than 100,000 tan oaks, coast live oaks, black oaks, and Shreve's oaks in coastal regions of northern California, and more recently in southwestern Oregon (Roach (2001) National Geographic News, Dec. 6, 2001);

Black Sigatoka, a fungal disease caused by Mycosphaerella species that attacks banana foliage, is spreading throughout the regions of the world that are responsible for producing most of the world's banana crop;

Eutypa dieback, caused by Eutypa lata, affects a number of crop plants, including vine grape. Eutypa dieback delays shoot emergence, and causes chlorosis, stunting, and tattering of leaves;

Pierce's disease, caused by the bacterium Xylella fastidiosa, precludes growth of grapes in the southeastern United States, and threatens the profitable wine grape industry in northern California. The bacterium clogs the vasculature of the grapevines, resulting in foliar scorching followed by slow death of the vines. There is no known treatment for Pierce's disease;

Bacterial Spot caused by the bacterium Xanthomonas campestris causes serious disease problems on tomatoes and peppers. It is a significant problem in the Florida tomato industry because it spreads rapidly, especially in warm periods where there is wind-driven rain. Under these conditions, there are no adequate control measures;

Diseases caused by viruses of the family Geminiviridae are a growing agricultural problem worldwide. Geminiviruses have caused severe crop losses in tomato, cassava, and cotton. For instance, in the 1991-1992 growing season in Florida, geminiviruses caused $140 million in damages to the tomato crop (Moffat (1991) Science 286: 1835). Geminiviruses have the ability to recombine between strains to rapidly produce new virulent varieties. Therefore, there is a pressing need for broad-spectrum geminivirus control;

The soybean cyst nematode, Heterodera glycines, causes stunting and chlorosis of soybean plants, which results in yield losses or plant death from severe infestation. Annual losses in the United States have been estimated at $1.5 billion (University of Minnesota Extension Service).

The aforementioned pathogens represent a very small fraction of diverse species that seriously affect plant health and yield. For a more complete description of numerous plant diseases, see, for example, Vidhyasekaran ((1997) Fungal Pathogenesis in Plants and Crops: Molecular Biology and Host Defense Mechanisms, Marcel Dekker, Monticello, N.Y.), or Agrios ((1997) Plant Pathology, Academic Press, New York, N.Y.). Plants that are able to resist disease may produce significantly higher yields and improved food quality. It is thus of considerable importance to find genes that reduce or prevent disease.

Category: Light Response; Desired Trait: Reduced Shade Avoidance.

Shade avoidance describes the process in which plants grown in close proximity attempt to out-compete each other by increasing stem length at the expense of leaf, fruit and storage organ development. This is caused by the plant's response to far-red radiation reflected from leaves of neighboring plants, which is mediated by phytochrome photoreceptors. Close proximity to other plants, as is produced in high-density crop plantings, increases the relative proportion of far-red irradiation, and therefore induces the shade avoidance response. Shade avoidance adversely affects biomass and yield, particularly when leaves, fruits or other storage organs constitute the desired crop (see, for example, Smith (1982) Annu. Rev. Plant Physiol. 33: 481-518; Ballare et al. (1990) Science 247: 329-332; Smith (1995) Annu. Rev. Plant Physiol. Plant Mol. Biol. 46: 289-315; and Schmitt et al. (1995) American Naturalist 146: 937-953). Alteration of the shade avoidance response in tobacco through alteration of phytochrome levels has been shown to produce an increase in harvest index (leaf biomass/total biomass) at high planting density, which would result in higher yield (Robson et al. (1996) Nature Biotechnol. 14: 995-998).
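The harvest index referred to above is simply the ratio of leaf biomass to total biomass. The following is a minimal sketch of that ratio; the function name and the biomass values are hypothetical and chosen only for illustration, and are not data from the cited study.

```python
def harvest_index(leaf_biomass: float, total_biomass: float) -> float:
    """Return harvest index, defined in the text as leaf biomass / total biomass."""
    if total_biomass <= 0:
        raise ValueError("total biomass must be positive")
    if not 0 <= leaf_biomass <= total_biomass:
        raise ValueError("leaf biomass must lie between 0 and total biomass")
    return leaf_biomass / total_biomass


# Hypothetical plants of equal total biomass (grams dry weight) at high
# planting density: one elongates its stem (shade avoidance), one does not.
shade_avoiding = harvest_index(leaf_biomass=30.0, total_biomass=100.0)
reduced_avoidance = harvest_index(leaf_biomass=40.0, total_biomass=100.0)

print(f"Shade-avoiding plant:    {shade_avoiding:.2f}")
print(f"Reduced shade avoidance: {reduced_avoidance:.2f}")
```

A higher index means that a larger share of the plant's biomass is allocated to the harvested organ, here the leaf, rather than to stem elongation.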

Category: Flowering Time; Desired Trait: Altered Flowering Time and Flowering Control.

Timing of flowering has a significant impact on production of agricultural products. For example, varieties with different flowering responses to environmental cues are necessary to adapt crops to different production regions or systems. Such a range of varieties has been developed for many crops, including wheat, corn, soybean, and strawberry. Improved methods for alteration of flowering time will facilitate the development of new, geographically adapted varieties.

Breeding programs for the development of new varieties can be limited by the seed-to-seed cycle. Thus, breeding new varieties of plants with multi-year cycles (such as biennials, e.g. carrot, or fruit trees, such as citrus) can be very slow. With respect to breeding programs, there would be a significant advantage in having commercially valuable plants that exhibit controllable and modified periods to flowering (“flowering times”). For example, accelerated flowering would shorten crop and tree breeding programs.

Improved flowering control allows more than one planting and harvest of a crop to be made within a single season. Early flowering would also improve the time to harvest plants in which the flower portion of the plant constitutes the product (e.g., broccoli, cauliflower, and other edible flowers). In addition, chemical control of flowering through induction or inhibition of flowering in plants could provide a significant advantage to growers by inducing more uniform fruit production (e.g., in strawberry).

A sizable number of plants for which the vegetative portion of the plant forms the valuable crop tend to “bolt” dramatically (e.g., spinach, onions, lettuce), after which biomass production declines and product quality diminishes (e.g., through flowering-triggered senescence of vegetative parts). Delay or prevention of flowering may also reduce or preclude dissemination of pollen from transgenic plants.

Category: Growth Rate; Desired Trait: Modified Growth Rate.

For almost all commercial crops, it is desirable to use plants that establish more quickly, since seedlings and young plants are particularly susceptible to stress conditions such as salinity or disease. Since many weeds may outgrow young crops or out-compete them for nutrients, it would also be desirable to determine means for allowing young crop plants to out-compete weed species. Increasing seedling growth rate (emergence) contributes to seedling vigor and allows for crops to be planted earlier in the season with less concern for losses due to environmental factors. Early planting helps add days to the critical grain-filling period and increases yield.

Providing means to speed up or slow down plant growth would also be desirable to ornamental horticulture. If such means were provided, slow-growing plants might exhibit a prolonged pollen-producing or fruiting period, thus improving fertilization or extending the harvesting season.

Category: Growth Rate; Desired Trait: Modified Senescence and Cell Death.

Premature senescence, triggered by various plant stresses, can limit production of both leaf biomass and seed yield. Transcription factor genes that suppress premature senescence or cell death in response to stresses can provide means for increasing yield. Delay of normal developmental senescence could also enhance yield, particularly for those plants for which the vegetative part of the plant represents the commercial product (e.g., spinach, lettuce).

Although leaf senescence is thought to be an evolutionary adaptation to recycle nutrients, the ability to control senescence in an agricultural setting has significant value. For example, a delay in leaf senescence in some maize hybrids is associated with a significant increase in yields, and a delay of a few days in the senescence of soybean plants can have a large impact on yield. In an experimental setting, tobacco plants engineered to inhibit leaf senescence had a longer photosynthetic lifespan, and produced a 50% increase in dry weight and seed yield (Gan and Amasino (1995) Science 270: 1986-1988). Delayed flower senescence may generate plants that retain their blossoms longer and this may be of potential interest to the ornamental horticulture industry, and delayed foliar and fruit senescence could improve post-harvest shelf-life of produce.

Further, programmed cell death plays a role in other plant responses, including the resistance response to disease, and some symptoms of diseases, for example, as caused by necrotrophic pathogens such as Botrytis cinerea and Sclerotinia sclerotiorum (Dickman et al. (2001) Proc. Natl. Acad. Sci. 98: 6957-6962). Localized senescence and/or cell death can be used by plants to contain the spread of harmful microorganisms. A specific localized cell death response, the “hypersensitive response”, is a component of race-specific disease resistance mediated by plant resistance genes. The hypersensitive response is thought to help limit pathogen growth and to initiate a signal transduction pathway that leads to the induction of systemic plant defenses. Accelerated senescence may be a defense against obligate pathogens, such as powdery mildew, that rely on healthy plant tissue for nutrients. With regard to powdery mildew, Botrytis cinerea, Sclerotinia sclerotiorum and other pathogens, transcription factors that ameliorate cell death and/or damage may reduce the significant economic losses encountered, such as, for example, those caused by Botrytis cinerea in strawberry and grape.

Category: Growth Regulator; Desired Trait: Altered Sugar Sensing

Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, flowering, senescence, sugar metabolism and photosynthesis. Sucrose, for example, is the major transport form of photosynthate and its flux through cells has been shown to affect gene expression and alter storage compound accumulation in seeds (source-sink relationships). Glucose-specific hexose-sensing has also been described in plants and is implicated in cell division and repression of “famine” genes (photosynthetic or glyoxylate cycles).

Category: Morphology; Desired Trait: Altered Morphology

Trichomes are branched or unbranched epidermal outgrowths or hair structures on a plant. Trichomes produce a variety of secondary biochemicals such as diterpenes and waxes, the former being important as, for example, insect pheromones, and the latter as protectants against desiccation and herbivorous pests. Since diterpenes also have commercial value as flavors, aromas, pesticides and cosmetics, and potential value as anti-tumor agents and inflammation-mediating substances, they have been both products and the target of considerable research. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity. Thus, it would be advantageous to discover trichome-affecting transcription factor genes for the purpose of increasing trichome density, size, or type to produce plants that are better protected from insects or that yield higher amounts of secondary metabolites.

The ability to manipulate wax composition, amount, or distribution could modify plant tolerance to drought and low humidity or resistance to insects, as well as plant appearance. In particular, a possible application for a transcription factor gene that reduces wax production in sunflower seed coats would be to reduce fouling during seed oil processing. Antisense or co-suppression of transcription factors involved in wax biosynthesis in a tissue specific manner can be used to specifically alter wax composition, amount, or distribution in those plants and crops from which wax is either a valuable attribute or product or an undesirable constituent of plants.

Other morphological characteristics that may be desirable in plants include those of an ornamental nature. These include changes in seed color, overall color, leaf and flower shape, leaf color, leaf size, or glossiness of leaves. Plants that produce dark leaves may have benefits for human health; flavonoids, for example, have been used to inhibit tumor growth, prevent bone loss, and prevent lipid oxidation in animals and humans. Plants in which leaf size is increased would likely provide greater biomass, which would be particularly valuable for crops in which the vegetative portion of the plant constitutes the product. Plants with glossy leaves generally produce greater epidermal wax, which, if it could be augmented, would result in a pleasing appearance for many ornamentals, help prevent desiccation, and help resist herbivorous insects and disease-causing agents. Changes in plant or plant part coloration, brought about by modifying, for example, anthocyanin levels, would provide novel morphological features.

In many instances, the seeds of a plant constitute a valuable crop. These include, for example, the seeds of many legumes, nuts and grains. The discovery of means for producing larger seed would provide significant value by bringing about an increase in crop yield.

Plants with altered inflorescence, including, for example, larger flowers or distinctive floral configurations, may have high value in the ornamental horticulture industry.

Modifications to flower structure may have advantageous or deleterious effects on fertility, and could be used, for example, to decrease fertility by the absence, reduction or screening of reproductive components. This could be a desirable trait, as it could be exploited to prevent or minimize the escape of the pollen of genetically modified organisms into the environment.

Manipulation of inflorescence branching patterns may also be used to influence yield and offer the potential for more effective harvesting techniques. For example, a “self pruning” mutation of tomato results in a determinate growth pattern and facilitates mechanical harvesting (Pnueli et al. (2001) Plant Cell 13(12): 2687-2702).

Alterations of apical dominance or plant architecture could create new plant varieties. Dwarf plants may be of potential interest to the ornamental horticulture industry.

Category: Seed Biochemistry; Desired Trait: Altered Seed Oil

The composition of seeds, particularly with respect to seed oil quantity and/or composition, is very important for the nutritional value and production of various food and feed products. Desirable improvements to oils include enhanced heat stability and improved nutritional quality through, for example, reducing the number of calories in seed, increasing the number of calories in animal feeds, or altering the ratio of saturated to unsaturated lipids comprising the oils.

Category: Seed Biochemistry; Desired Trait: Altered Seed Protein

As with seed oils, seed protein content and composition is veryimportant for the nutritional value and production of various food andfeed products. Altered protein content or concentration in seeds may beused to provide nutritional benefits, and may also prolong storagecapacity, increase seed pest or disease resistance, or modifygermination rates. Altered amino acid composition of seeds, throughaltered protein composition, is also a desired objective for nutritionalimprovement.

Category: Seed Biochemistry; Desired Trait: Altered Prenyl Lipids.

Prenyl lipids, including the tocopherols, play a role in anchoringproteins in membranes or membranous organelles. Tocopherols have bothanti-oxidant and vitamin E activity. Modified tocopherol composition ofplants may thus be useful in improving membrane integrity and function,which may mitigate abiotic stresses such as heat stress. Increasing theanti-oxidant and vitamin content of plants through increased tocopherolcontent can provide useful human health benefits.

Category: Leaf Biochemistry; Desired Trait: Altered Glucosinolate Levels

Increases or decreases in specific glucosinolates or total glucosinolatecontent can be desirable depending upon the particular application. Forexample: (i) glucosinolates are undesirable components of the oilseedsused in animal feed, since they produce toxic effects; low-glucosinolatevarieties of canola have been developed to combat this problem; (ii)some glucosinolates have anti-cancer activity; thus, increasing thelevels or composition of these compounds can be of use in production ofnutraceuticals; and (iii) glucosinolates form part of a plant's naturaldefense against insects; modification of glucosinolate composition orquantity could therefore afford increased protection from herbivores.Furthermore, tissue specific promoters can be used in edible crops toensure that these compounds accumulate specifically in particulartissues, such as the epidermis, which are not taken for humanconsumption.

Category: Leaf Biochemistry; Desired Trait: Flavonoid Production.

Expression of transcription factors that increase flavonoid productionin plants, including anthocyanins and condensed tannins, may be used toalter pigment production for horticultural purposes, and possibly toincrease stress resistance. Flavonoids have antimicrobial activity andcould be used to engineer pathogen resistance. Several flavonoidcompounds have human health promoting effects such as inhibition oftumor growth, prevention of bone loss and prevention of lipid oxidation.Increased levels of condensed tannins in forage legumes would provideagronomic benefits in ruminants by preventing pasture bloat bycollapsing protein foams within the rumen. For a review on the utilitiesof flavonoids and their derivatives, see Dixon et al. (1999) TrendsPlant Sci. 4: 394-400.

The present invention relates to methods and compositions for producingtransgenic plants with modified traits, particularly traits that addressthe agricultural and food needs described in the above backgroundinformation. These traits may provide significant value in that theyallow the plant to thrive in hostile environments, where, for example,temperature, water and nutrient availability or salinity may limit orprevent growth of non-transgenic plants. The traits may also comprisedesirable morphological alterations, larger or smaller size, disease andpest resistance, alterations in flowering time, light response, andothers.

We have identified polynucleotides encoding transcription factors,developed numerous transgenic plants using these polynucleotides, andhave analyzed the plants for a variety of important traits. In so doing,we have identified important polynucleotide and polypeptide sequencesfor producing commercially valuable plants and crops as well as themethods for making them and using them. Other aspects and embodiments ofthe invention are described below and can be derived from the teachingsof this disclosure as a whole.

SUMMARY OF THE INVENTION

The present invention pertains to transgenic plants that comprise arecombinant polynucleotide that includes a nucleotide sequence encodinga HAP3, HAP3-like or HAP5 CCAAT transcription factor with the ability toregulate abiotic stress tolerance in a plant.

Transgenic plants and methods for producing transgenic plants areprovided. The transgenic plants comprise a recombinant polynucleotidehaving a polynucleotide sequence, or a sequence that is complementary tothis polynucleotide sequence, that encodes a transcription factor.

The present invention is directed to transgenic plants with altered expression of a transcription factor polypeptide. When these plants are compared to non-transgenic plants or wild-type control plants of the same species (that is, those that do not have the expression of the transcription factor polypeptide altered), the transgenic plants are shown to have increased tolerance to a variety of abiotic stresses. These may include, for example, osmotic stress, drought, salt stress, heat stress, or cold stress. A sizeable number of transcription factor sequences, which may be found in the Sequence Listing, have been used to confer these advantages. For example, the transgenic plant may comprise as part of its genome a transgene encoding a polypeptide member of the CCAAT-box binding transcription factor family or the MYB-related transcription factor family. By overexpressing these polypeptide members, the plant becomes more tolerant of abiotic stress than wild-type controls.

In a further refinement of the invention, the polypeptide member may beof the CCAAT box-binding transcription factor family, and morespecifically, from the G482 subclade of the non-LEC1-like clade ofproteins of the L1L-related CCAAT transcription factor family. Thesepolypeptides comprise a B domain (B domains are further characterized inthe specification that follows, with examples from various plant specieslisted). The B domains of these polypeptides are able to bind aregulatory region of DNA comprising the motif CCAAT, and this binding isresponsible for the regulation of transcription of that DNA. Thisregulation of transcription confers increased abiotic stress tolerancein the transgenic plant.
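The binding target described above, a regulatory region of DNA comprising the motif CCAAT, can be illustrated with a minimal sketch that scans a sequence for the motif on both strands. The function names and the example promoter sequence are hypothetical and are not taken from the Sequence Listing; the sketch only locates the literal pentamer and says nothing about whether a given protein actually binds there.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="CCAAT"):
    """List (0-based position, strand) for each occurrence of the motif.

    A "-" hit means the reverse complement of the motif (ATTGG for CCAAT)
    appears on the given strand, i.e. the motif lies on the opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif:
            hits.append((i, "-"))
    return hits


# Hypothetical promoter fragment, for illustration only.
promoter = "GGACCAATCAGTTATTGGCA"
print(find_motif(promoter))
```

Reporting the strand alongside the position is a design choice made here because a regulatory motif may occur in either orientation relative to the gene it controls.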

Alternatively, the polypeptide member may be a MYB-related transcription factor, and more specifically a member of the G682 subclade of MYB-related transcription factors. These polypeptides comprise a MYB-related domain (MYB-related domains are further characterized in the specification that follows, with examples from various plant species listed). The MYB-related domains of these polypeptides are able to bind a regulatory region of DNA, and this binding is responsible for the regulation of transcription of that DNA. Similar to the regulation of transcription by CCAAT family transcription factors, increased abiotic stress tolerance in the transgenic plant is achieved by the binding of the MYB-related transcription factor to the DNA.

In another alternative, the polypeptide member may be a member of a widevariety of transcription factor families and groups, exemplified by thetranscription factor polypeptides of the invention that have been shownto confer useful advantages or traits in plants when the expression ofthese polypeptides is altered in the plants.

The transgenic plants of the invention may be either dicotyledonous or monocotyledonous, and the polynucleotide and polypeptide sequences of the invention may be derived from dicotyledonous or monocotyledonous plants, or be creations that are structurally and functionally similar. The transgenic plants that define the invention may be crossed with other plants or themselves to generate progeny plants, which may possess advantageous characteristics such as improved stress tolerance. Seed derived from the transgenic plants of the invention is also encompassed within the scope of the invention.

The invention is also directed to methods for producing a transgenicplant having increased tolerance to abiotic stress by introducing anexpression vector that contains a polynucleotide sequence encoding apolypeptide of the invention into the plant.

The polypeptide may comprise, for example, a member of the G482 subclade of the non-LEC1-like clade of proteins of the L1L-related CCAAT transcription factor family, or a member of the G682 subclade of MYB-related transcription factors, or a member of any of the other transcription factor families or groups of the invention that have been shown to provide useful traits in plants when their expression is altered. In the case where the polypeptide comprises a G482 subclade member, the polypeptide will comprise a B domain that is sufficiently homologous to the B domain of G481, SEQ ID NO: 88, that the B domain functions in the same manner as the G481 domain by binding to DNA at a transcription regulating region comprising the motif CCAAT, thus regulating transcription of the DNA and conferring increased abiotic stress tolerance in the transgenic plant as compared to non-transgenic plants of the same species.

Alternatively, the expression vector may also contain a polynucleotide sequence that encodes a polypeptide of the G682 subclade of MYB-related transcription factors, the MYB-related domain of the polypeptide being sufficiently homologous to the MYB-related domain of G682, SEQ ID NO: 148, that the MYB-related domain binds to DNA at a transcription regulating region and brings about transcriptional regulation and increased abiotic stress tolerance in the transgenic plant.

Regardless of the nucleotide sequence used, the expression vector will also contain one or more regulatory elements operably linked to the nucleotide sequence. It is through the regulatory elements that expression of the nucleotide sequence is controlled in a target plant. By introducing the expression vector into a cell of the target plant, growing the plant cell into a plant and allowing the plant to overexpress the polypeptide, a plant with improved abiotic stress tolerance (relative to, for example, wild-type controls) is generated. The improved plants may be identified by comparison with one or more control plants.

Any plant with increased stress tolerance produced by this method may becrossed with itself or another plant. Seed that develops as a result ofthis crossing may then be selected, and a progeny plant grown from theseed. This would have the effect of producing a transgenic progeny plantthat would have increased tolerance to abiotic stress, as compared to anon-transgenic plant of the same species.

The invention also pertains to a method for increasing a plant's tolerance to abiotic stress by providing a vector comprising a nucleotide of the invention or a nucleotide encoding a polypeptide of the invention, and regulatory elements operably linked to the nucleotide sequence. These regulatory elements are able to control expression of the nucleotide sequence in a target plant. The target plant is transformed with the vector to generate a transformed plant with increased tolerance to abiotic stress compared to non-transgenic plants of the same species. The sequences used in this method may include, for example, polynucleotide or polypeptide members of the G482 subclade of the non-LEC1-like clade of proteins of the L1L-related CCAAT transcription factor family, or of the G682 subclade of MYB-related transcription factors. In the case of either of these examples, the conserved domain, that is, the B or MYB-related domain, would be sufficiently homologous to the corresponding domain of SEQ ID NO: 88 or 148 (G481 or G682, respectively) that the conserved domain binds to DNA at a transcription regulating region, and thus regulates transcription and increases abiotic stress tolerance in the plant as compared to a non-transformed or non-transgenic plant of the same species.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING AND DRAWINGS

The Sequence Listing provides exemplary polynucleotide and polypeptidesequences of the invention. The traits associated with the use of thesequences are included in the Examples.

CD-ROM1 (Copy 1) is a read-only memory computer-readable compact discand contains a copy of the Sequence Listing in ASCII text format. TheSequence Listing is named “MBI0047PCT.ST25.txt” and is 6,312 kilobytesin size. The copies of the Sequence Listing on the CD-ROM disc arehereby incorporated by reference in their entirety.

CD-ROM2 (Copy 2) is an exact copy of CD-ROM1 (Copy 1).

CD-ROM3 contains a computer-readable format (CRF) copy of the SequenceListing as a text (.txt) file.

FIG. 1 shows a conservative estimate of phylogenetic relationships amongthe orders of flowering plants (modified from Angiosperm Phylogeny Group(1998) Ann. Missouri Bot. Gard. 84: 1-49). Those plants with a singlecotyledon (monocots) are a monophyletic clade nested within at least twomajor lineages of dicots; the eudicots are further divided into rosidsand asterids. Arabidopsis is a rosid eudicot classified within the orderBrassicales; rice is a member of the monocot order Poales. FIG. 1 wasadapted from Daly et al. (2001) Plant Physiol. 127: 1328-1333.

FIG. 2 shows a phylogenetic dendrogram depicting phylogenetic relationships of higher plant taxa, including clades containing tomato and Arabidopsis; adapted from Ku et al. (2000) Proc. Natl. Acad. Sci. 97: 9121-9126; and Chase et al. (1993) Ann. Missouri Bot. Gard. 80: 528-580.

FIGS. 3A and 3B show an alignment of G682 (SEQ ID NO: 148) and polynucleotide sequences that are paralogous and orthologous to G682. The alignment was produced using MACVECTOR software (Accelrys, Inc., San Diego, Calif.).

FIGS. 4A, 4B, 4C and 4D show an alignment of G867 (SEQ ID NO: 170) andpolynucleotide sequences that are paralogous and orthologous to G867.The alignment was produced using MACVECTOR software (Accelrys, Inc.).

FIGS. 5A, 5B, 5C, 5D, 5E and 5F show an alignment of G912 (SEQ ID NO:186) and polynucleotide sequences that are paralogous and orthologous toG912. The alignment was produced using MACVECTOR software (Accelrys,Inc.).

FIG. 6 is adapted from Kwong et al (2003) Plant Cell 15: 5-18, and showscrop sequences identified through BLAST analysis of various L1L-relatedsequences. A phylogeny tree was then generated using ClustalX based onwhole protein sequences showing the non-LEC1-like HAP3 clade oftranscription factors (large box). This clade contains Arabidopsissequences as well as members from diverse species that arephylogenetically distinct from the LEC1-like proteins (seen in the nexttwo figures). The smaller box delineates the G482-like subclade,containing transcription factors that are structurally most closelyrelated to G481 and G482.

Similar to FIG. 6, FIG. 7 shows the phylogenetic relationship of sequences within the G482-subclade (within the smaller box) and the non-LEC1-like clade (larger box). The G482-subclade contains members from diverse species, including Arabidopsis, soy, rice and corn sequences that are phylogenetically distinct from the LEC1-like proteins, some of which are also shown in FIG. 6. Arabidopsis plants carrying a mutation in the LEC1 gene display embryos that are intolerant to desiccation and that show defects in seed maturation (Lotan et al. (1998) Cell 93: 1195-1205). We have shown that the G482 subclade members confer abiotic stress tolerance. Given their phylogenetic divergence, it is possible that LEC1-like proteins seen below the G482 subclade have evolved to confer desiccation tolerance specifically within the embryo, whereas the G482 subclade members evolved to confer abiotic stress tolerance in non-embryonic tissues.

For FIG. 8, a phylogenetic relationship of sequences within the G482-subclade was generated with a different method than that used to create FIG. 7. In this case, the phylogenetic tree and multiple sequence alignments of G481 and related full length proteins were constructed using ClustalW (CLUSTAL W Multiple Sequence Alignment Program version 1.83, 2003) and MEGA2 (http://www.megasoftware.net) software. The ClustalW multiple alignment parameters were:

Gap Opening Penalty: 10.00

Gap Extension Penalty: 0.20

Delay divergent sequences: 30%

DNA Transitions Weight: 0.50

Protein weight matrix: Gonnet series

DNA weight matrix: IUB

Use negative matrix: OFF.
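As a rough illustration of how the alignment parameters listed above might be passed to a command-line ClustalW run, the following sketch assembles an argument list. The function name, the input file name, and the executable name are hypothetical, and the option spellings should be checked against the help output of the installed ClustalW version; the negative matrix option is simply omitted because it is OFF. The sketch only builds the command and does not run it.

```python
# Alignment parameters as listed in the text, keyed by the ClustalW
# command-line option assumed to correspond to each one.
CLUSTALW_PARAMS = {
    "GAPOPEN": "10.00",      # Gap Opening Penalty
    "GAPEXT": "0.20",        # Gap Extension Penalty
    "MAXDIV": "30",          # Delay divergent sequences (%)
    "TRANSWEIGHT": "0.50",   # DNA Transitions Weight
    "MATRIX": "GONNET",      # Protein weight matrix: Gonnet series
    "DNAMATRIX": "IUB",      # DNA weight matrix
}


def build_clustalw_command(infile, executable="clustalw"):
    """Return an argument list for a full multiple alignment of proteins."""
    cmd = [executable, f"-INFILE={infile}", "-ALIGN", "-TYPE=PROTEIN"]
    cmd += [f"-{option}={value}" for option, value in CLUSTALW_PARAMS.items()]
    return cmd


# Hypothetical input file of full-length protein sequences.
command = build_clustalw_command("hap3_proteins.fasta")
print(" ".join(command))
```

The resulting list could be handed to `subprocess.run` on a system where ClustalW is installed.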

A FastA formatted alignment was then used to generate a phylogenetic tree in MEGA2 using the neighbor joining algorithm and a p-distance model. A test of phylogeny was done via bootstrap with 100 replications and Random Seed set to default. Cut off values of the bootstrap tree were set to 50%. The members of the G482 subclade of the non-LEC1-like clade of proteins of the L1L-related CCAAT transcription factor family (box) are derived from a common single node (arrow), and a representative number of them, including Arabidopsis sequences G481, G482, G485, G1364 and G2345; soy sequences G3470, G3471, G3472, G3475 and G3476; rice sequences G3395, G3397, G3398 and G3429; and corn sequences G3434 and G3436, have been shown to confer abiotic stress tolerance in plants (see Example XIII).
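The p-distance model and the bootstrap test mentioned above can be sketched in a few lines. The p-distance between two aligned sequences is the proportion of compared sites at which they differ, and a bootstrap replicate is built by resampling alignment columns with replacement. This is a minimal sketch rather than MEGA2's implementation: the sequences are hypothetical, gapped sites are skipped pairwise (an assumption about gap handling), and the random seed is chosen arbitrarily.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ("-") in either sequence are skipped (assumed handling).
    """
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement to make one replicate."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Hypothetical aligned protein fragments, for illustration only.
alignment = {
    "seqA": "MKREQDRFLP",
    "seqB": "MKREQDRYLP",
    "seqC": "MRREQ-RYLP",
}

rng = random.Random(0)  # arbitrary seed so the sketch is repeatable
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(p_distance(alignment["seqA"], alignment["seqB"]))
```

In a full analysis, a tree would be built from the distance matrix of each replicate, and only groupings recovered in at least 50% of the replicates would be retained.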

FIG. 9 shows the domain structure of HAP3 proteins. HAP3 proteinscontain an amino-terminal A domain, a central B domain, and acarboxy-terminal C domain. There may be relatively little sequencesimilarity between HAP3 proteins in the A and C domains. The A and Cdomains could thus provide a degree of specificity to each member of theHAP3 family. The B domain is the conserved region that specifies DNAbinding and subunit association.

In FIGS. 10A-10F, the alignments of HAP3 polypeptides are presented, including G481, G482, G485, G1364, G2345, G1781 and related sequences from Arabidopsis aligned with soybean, rice and corn sequences, showing the B domains (indicated by the line that spans FIGS. 10B through 10D). Consensus residues within the listed sequences are indicated by boldface. The boldfaced residues in the consensus sequence that appears at the bottom of FIGS. 10A through 10C in their respective positions are uniquely found in the non-LEC1-like clade. The underlined serine residue appearing in the consensus sequence in its respective position is uniquely found within the G482-like subclade. As discussed in greater detail below in Example III, the residue positions indicated by the arrows in FIG. 10B are associated with an alteration of flowering time when these polypeptides are overexpressed.

FIG. 11 is a phylogenetic tree of defined conserved domains of G682 andrelated polypeptides constructed with ClustalW (CLUSTAL W MultipleSequence Alignment Program version 1.83, 2003) and MEGA2(http://www.megasoftware.net) software. ClustalW multiple alignmentparameters were as follows:

Gap Opening Penalty: 10.00

Gap Extension Penalty: 0.20

Delay divergent sequences: 30%

DNA Transitions Weight: 0.50

Protein weight matrix: Gonnet series

DNA weight matrix: IUB

Use negative matrix: OFF

A FastA formatted alignment was then used to generate a phylogenetic tree in MEGA2 using the neighbor joining algorithm and a p-distance model. A test of phylogeny was done via bootstrap with 100 replications and Random Seed set to default. Cut off values of the bootstrap tree were set to 50%. The G682 subclade of MYB-related transcription factors, a group of structurally and functionally related sequences that derive from a single ancestral node (arrow), appears within the box in FIG. 11.
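The neighbor joining algorithm named above proceeds by repeatedly joining the pair of taxa that minimizes the criterion Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where d is the pairwise distance, r is a row sum of the distance matrix, and n is the number of taxa. The sketch below performs only the first joining step, to show how a pair and its branch lengths are chosen. It is an illustration under stated assumptions, not MEGA2's code; the taxon names and distance values are a small hypothetical example.

```python
def nj_first_join(names, dist):
    """Pick the first pair joined by neighbor joining and its branch lengths.

    `dist` is a symmetric matrix (list of lists) with zeros on the diagonal;
    at least three taxa are required.
    """
    n = len(names)
    if n < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from each joined taxon to the new internal node.
    len_i = 0.5 * dist[i][j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return names[i], names[j], len_i, len_j


# Hypothetical five-taxon distance matrix, for illustration only.
taxa = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_first_join(taxa, distances))
```

A complete tree would be obtained by replacing the joined pair with the new node, recomputing distances to it, and repeating until all taxa are joined.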

FIGS. 12A-12D show the effects of water deprivation and recovery from this treatment on Arabidopsis control and 35S::G481-overexpressing lines. After eight days of drought treatment, overexpressing plants had a darker green and less withered appearance (FIG. 12C) than those in the control group (FIG. 12A). The differences in appearance between the control and G481-overexpressing plants after they were rewatered were even more striking. Most (11 of 12 plants; exemplified in FIG. 12B) of this set of control plants died after rewatering, indicating the inability to recover following severe water deprivation, whereas all nine of the overexpressor plants of the line shown recovered from this drought treatment (exemplified in FIG. 12D). The results shown in FIGS. 12A-12D were typical of a number of control and 35S::G481-overexpressing lines.

FIGS. 13A and 13B show the effects of salt stress on Arabidopsis seed germination. The three lines of G481 and G482 overexpressors on these two plates had longer roots and showed greater cotyledon expansion (arrows) after three days on 150 mM NaCl than the control seedlings on the right-hand sides of the plates.

In FIG. 14A, G481 null mutant seedlings (labeled K481) show reduced tolerance of osmotic stress, relative to the control seedlings in FIG. 14B, as evidenced by the reduced cotyledon expansion and root growth in the former group. In the absence of this stress, on control media (FIG. 14C, G481 null mutants; FIG. 14D, control seedlings), the knockout and control plants appear the same.

FIGS. 15A-15D show the effects of stress-related treatments on G485overexpressing seedlings (35S::G485 lines) in plate assays. In eachtreatment, including cold, high sucrose, high salt and ABA germinationassays, the overexpressors fared much better than the wild-type controlsexposed to the same treatments in FIGS. 15E-15H, respectively, asevidenced by the enhanced cotyledon expansion and root growth seen withthe overexpressing seedlings.

FIGS. 16A-16C depict the effects of G485 knockout and overexpression on flowering time and maturation. As seen in FIG. 16A, a T-DNA insertion knockout mutant containing a SALK_062245 insertion was shown to flower several days later than wild-type control plants. The plants in FIG. 16A are shown 44 days after germination. FIG. 16C shows that G485 primary transformants flowered distinctly earlier than wild-type controls. These plants are shown 24 days after germination. These effects were observed in each of two independent T1 plantings derived from separate transformation dates. Additionally, accelerated flowering was also seen in plants that overexpressed G485 from a two component system (35S::LexA; op-LexA::G485). These studies indicated that G485 is both sufficient to act as a floral activator and necessary in that role within the plant. G485 overexpressor plants also matured and set siliques much more rapidly than wild-type controls, as shown in FIG. 16B with plants 39 days post-germination.

FIG. 17A shows the effects of a high salt medium on the germination of35S::G3392 (a rice G682 ortholog) line 322. In this figure, theseedlings appeared larger and greener than the wild-type Arabidopsisseedlings similarly treated in FIG. 17B.

FIG. 18A shows the effects of cold treatment on the germination of35S::G3450 (soy G682 ortholog) line 307. In this figure, the seedlingshad less anthocyanin than the wild-type Arabidopsis seedlings similarlytreated in FIG. 18B.

FIG. 19A compares the germination of 35S::G3431 (a corn G682 ortholog)line 327 Arabidopsis seedlings on a 9.4% sucrose germination plate withnon-transgenic control seedlings of the same species similarly treatedin FIG. 19B. The overexpressors were greener and had less anthocyaninthan the non-transgenic control plants of the same species.

In FIG. 20, the deleterious effects of a heat treatment are seen in thewild-type plants on the right, and to a much lesser extent in the plantsoverexpressing the soy G3450 sequence on the left.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

In an important aspect, the present invention relates to polynucleotides and polypeptides, for example, for modifying phenotypes of plants. Throughout this disclosure, various information sources are referred to and/or are specifically incorporated. The information sources include scientific journal articles, patent documents, textbooks, and World Wide Web browser-inactive page addresses, for example. While the reference to these information sources clearly indicates that they can be used by one of skill in the art, each and every one of the information sources cited herein is specifically incorporated in its entirety, whether or not a specific mention of “incorporation by reference” is noted. The contents and teachings of each and every one of the information sources can be relied on and used to make and use embodiments of the invention.

As used herein and in the appended claims, the singular forms “a,” “an,”and “the” include plural reference unless the context clearly dictatesotherwise. Thus, for example, a reference to “a plant” includes aplurality of such plants, and a reference to “a stress” is a referenceto one or more stresses and equivalents thereof known to those skilledin the art, and so forth.

The polynucleotide sequences of the invention encode polypeptides thatare members of well-known transcription factor families, including planttranscription factor families, as disclosed in Tables 7-11. Generally,the transcription factors encoded by the present sequences are involvedin cellular metabolism, cell differentiation and proliferation and theregulation of growth. Accordingly, one skilled in the art wouldrecognize that by expressing the present sequences in a plant, one maychange the expression of autologous genes or induce the expression ofintroduced genes. By affecting the expression of similar autologoussequences in a plant that have the biological activity of the presentsequences, or by introducing the present sequences into a plant, one mayalter a plant's phenotype to one with improved traits. The sequences ofthe invention may also be used to transform a plant and introducedesirable traits not found in the wild-type cultivar or strain. Plantsmay then be selected for those that produce the most desirable degree ofover- or under-expression of target genes of interest and coincidenttrait improvement.

The sequences of the present invention may be from any species, particularly plant species, in a naturally occurring form or from any source whether natural, synthetic, semi-synthetic or recombinant. The sequences of the invention may also include fragments of the present amino acid sequences. In this context, a “fragment” refers to a fragment of a polypeptide sequence which is at least 5 to about 15 amino acids in length, most preferably at least 14 amino acids, and which retains some biological activity of a transcription factor. Where “amino acid sequence” is recited to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete native amino acid sequence associated with the recited protein molecule.

As one of ordinary skill in the art recognizes, transcription factors can be identified by the presence of a region or domain of structural similarity or identity to a specific consensus sequence or the presence of a specific consensus DNA-binding site or DNA-binding site motif (see, for example, Riechmann et al. (2000) Science 290: 2105-2110). The plant transcription factors may belong to one of the following transcription factor families: the AP2 (APETALA2) domain transcription factor family (Riechmann and Meyerowitz (1998) Biol. Chem. 379: 633-646); the MYB transcription factor family (Martin and Paz-Ares (1997) Trends Genet. 13: 67-73); the MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol. Chem. 378: 1079-1101); the WRKY protein family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244: 563-571); the ankyrin-repeat protein family (Zhang et al. (1992) Plant Cell 4: 1575-1588); the zinc finger protein (Z) family (Klug and Schwabe (1995) FASEB J. 9: 597-604; Takatsuji (1998) Cell. Mol. Life Sci. 54: 582-596); the homeobox (HB) protein family (Buerglin (1994) in Guidebook to the Homeobox Genes, Duboule (ed.) Oxford University Press); the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 3: 1166-1178); the squamosa promoter binding proteins (SPB) (Klein et al. (1996) Mol. Gen. Genet. 250: 7-16); the NAM protein family (Souer et al. (1996) Cell 85: 159-170); the IAA/AUX proteins (Abel et al. (1995) J. Mol. Biol. 251: 533-549); the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1: 639-709); the DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13: 2994-3002); the bZIP family of transcription factors (Foster et al. (1994) FASEB J. 8: 192-200); the Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) Plant J. 4: 125-135); the high mobility group (HMG) family (Bustin and Reeves (1996) Prog. Nucl. Acids Res. Mol. Biol. 54: 35-100); the scarecrow (SCR) family (Di Laurenzio et al. (1996) Cell 86: 423-433); the GF14 family (Wu et al. (1997) Plant Physiol. 114: 1421-1431); the polycomb (PCOMB) family (Goodrich et al. (1997) Nature 386: 44-51); the teosinte branched (TEO) family (Luo et al. (1996) Nature 383: 794-799); the ABI3 family (Giraudat et al. (1992) Plant Cell 4: 1251-1261); the triple helix (TH) family (Dehesh et al. (1990) Science 250: 1397-1399); the EIL family (Chao et al. (1997) Cell 89: 1133-44); the AT-HOOK family (Reeves and Nissen (1990) J. Biol. Chem. 265: 8573-8582); the S1FA family (Zhou et al. (1995) Nucleic Acids Res. 23: 1165-1169); the bZIPT2 family (Lu and Ferl (1995) Plant Physiol. 109: 723); the YABBY family (Bowman et al. (1999) Development 126: 2387-96); the PAZ family (Bohmert et al. (1998) EMBO J. 17: 170-80); a family of miscellaneous (MISC) transcription factors including the DPBF family (Kim et al. (1997) Plant J. 11: 1237-1251) and the SPF1 family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244: 563-571); the GARP family (Hall et al. (1998) Plant Cell 10: 925-936); the TUBBY family (Boggin et al. (1999) Science 286: 2119-2125); the heat shock family (Wu (1995) Annu. Rev. Cell Dev. Biol. 11: 441-469); the ENBP family (Christiansen et al. (1996) Plant Mol. Biol. 32: 809-821); the RING-zinc family (Jensen et al. (1998) FEBS Letters 436: 283-287); the PDBP family (Janik et al. (1989) Virology 168: 320-329); the PCF family (Cubas et al. (1999) Plant J. 18: 215-22); the SRS (SHI-related) family (Fridborg et al. (1999) Plant Cell 11: 1019-1032); the CPP (cysteine-rich polycomb-like) family (Cvitanich et al. (2000) Proc. Natl. Acad. Sci. 97: 8163-8168); the ARF (auxin response factor) family (Ulmasov et al. (1999) Proc. Natl. Acad. Sci. 96: 5844-5849); the SWI/SNF family (Collingwood et al. (1999) J. Mol. Endocrinol. 23: 255-275); the ACBF family (Seguin et al. (1997) Plant Mol. Biol. 35: 281-291); the PCGL (CG-1 like) family (da Costa e Silva et al. (1994) Plant Mol. Biol. 25: 921-924); the ARID family (Vazquez et al. (1999) Development 126: 733-742); the Jumonji family (Balciunas et al. (2000) Trends Biochem. Sci. 25: 274-276); the bZIP-NIN family (Schauser et al. (1999) Nature 402: 191-195); the E2F family (Kaelin et al. (1992) Cell 70: 351-364); and the GRF-like family (Knaap et al. (2000) Plant Physiol. 122: 695-704). As indicated by any part of the list above and as known in the art, transcription factors have been sometimes categorized by class, family, and sub-family according to their structural content and consensus DNA-binding site motif, for example. Many of the classes and many of the families and sub-families are listed here. However, the inclusion of one sub-family and not another, or the inclusion of one family and not another, does not mean that the invention does not encompass polynucleotides or polypeptides of a certain family or sub-family. The list provided here is merely an example of the types of transcription factors and the knowledge available concerning the consensus sequences and consensus DNA-binding site motifs that help define them as known to those of skill in the art (each of the references noted above is specifically incorporated herein by reference). A transcription factor may include, but is not limited to, any polypeptide that can activate or repress transcription of a single gene or a number of genes. This polypeptide group includes, but is not limited to, DNA-binding proteins, DNA-binding protein binding proteins, protein kinases, protein phosphatases, protein methyltransferases, GTP-binding proteins, receptors, and the like.

In addition to methods for modifying a plant phenotype by employing one or more polynucleotides and polypeptides of the invention described herein, the polynucleotides and polypeptides of the invention have a variety of additional uses. These uses include their use in the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression; as diagnostic probes for the presence of complementary or partially complementary nucleic acids (including for detection of natural coding nucleic acids); as substrates for further reactions, e.g., mutation reactions, PCR reactions, or the like; as substrates for cloning, e.g., including digestion or ligation reactions; and for identifying exogenous or endogenous modulators of the transcription factors. A “polynucleotide” is a nucleic acid molecule comprising a plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized nucleotides, optionally at least about 30 consecutive nucleotides, or at least about 50 consecutive nucleotides. A polynucleotide may be a nucleic acid, oligonucleotide, nucleotide, or any fragment thereof. In many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5′ or 3′ untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be single stranded or double stranded DNA or RNA. The polynucleotide optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can be combined with carbohydrate, lipids, protein, or other materials to perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA). The polynucleotide can comprise a sequence in either sense or antisense orientations. “Oligonucleotide” is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe and is preferably single stranded.

Definitions

A “recombinant polynucleotide” is a polynucleotide that is not in its native state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or the polynucleotide is in a context other than that in which it is naturally found, e.g., separated from nucleotide sequences with which it typically is in proximity in nature, or adjacent to (or contiguous with) nucleotide sequences with which it typically is not in proximity. For example, the sequence at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic acids.

An “isolated polynucleotide” is a polynucleotide, whether naturally occurring or recombinant, that is present outside the cell in which it is typically found in nature, whether purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

A “polypeptide” is an amino acid sequence comprising a plurality of consecutive polymerized amino acid residues, e.g., at least about 15 consecutive polymerized amino acid residues, optionally at least about 30 consecutive polymerized amino acid residues, or at least about 50 consecutive polymerized amino acid residues. In many instances, a polypeptide comprises a polymerized amino acid residue sequence that is a transcription factor or a domain or portion or fragment thereof. A transcription factor can regulate gene expression and may increase or decrease gene expression in a plant. Additionally, the polypeptide may comprise 1) a localization domain, 2) an activation domain, 3) a repression domain, 4) an oligomerization domain, or 5) a DNA-binding domain, or the like. The polypeptide optionally comprises modified amino acid residues, naturally occurring amino acid residues not encoded by a codon, or non-naturally occurring amino acid residues.

A “recombinant polypeptide” is a polypeptide produced by translation of a recombinant polynucleotide. A “synthetic polypeptide” is a polypeptide created by consecutive polymerization of isolated amino acid residues using methods well known in the art. An “isolated polypeptide,” whether a naturally occurring or a recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild-type cell, e.g., more than about 5% enriched, more than about 10% enriched, or more than about 20%, or more than about 50%, or more, enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such an enrichment is not the result of a natural response of a wild-type plant. Alternatively, or additionally, the isolated polypeptide is separated from other cellular components with which it is typically associated, e.g., by any of the various protein purification methods herein.

“Identity” or “similarity” refers to sequence similarity between two polynucleotide sequences or between two polypeptide sequences, with identity being a more strict comparison. The phrases “percent identity” and “% identity” refer to the percentage of sequence similarity found in a comparison of two or more polynucleotide sequences or two or more polypeptide sequences. “Sequence similarity” refers to the percent similarity in base pair sequence (as determined by any suitable method) between two or more polynucleotide sequences. Two or more sequences can be anywhere from 0-100% similar, or any integer value therebetween. Identity or similarity can be determined by comparing a position in each sequence that may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same nucleotide base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between polynucleotide sequences is a function of the number of identical or matching nucleotides at positions shared by the polynucleotide sequences. A degree of identity of polypeptide sequences is a function of the number of identical amino acids at positions shared by the polypeptide sequences. A degree of homology or similarity of polypeptide sequences is a function of the number of amino acids at positions shared by the polypeptide sequences.
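The position-by-position comparison described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length and that gaps are written as "-"; the function name and the choice to count only positions where neither sequence has a gap are illustrative assumptions, and other conventions for the denominator are also in use.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned positions shared by both sequences.

    Assumption: inputs are pre-aligned (equal length); positions where
    either sequence carries a gap are not counted as shared positions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    shared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not shared:
        return 0.0
    matches = sum(1 for a, b in shared if a == b)
    return 100.0 * matches / len(shared)


# Hypothetical aligned sequences, for illustration only.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
print(percent_identity("MKV-L", "MRVAL"))
```

The same function applies to nucleotide and amino acid alignments, because identity only asks whether the same residue occupies a shared position.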

“Alignment” refers to a number of DNA or amino acid sequences aligned by lengthwise comparison so that components in common (i.e., nucleotide bases or amino acid residues) may be visually and readily identified. The fraction or percentage of components in common is related to the homology or identity between the sequences. Alignments such as those of FIGS. 3A and B, 4A-D, 5A-F and 9A-F may be used to identify conserved domains and relatedness within these domains. An alignment may suitably be determined by means of computer programs known in the art, such as MACVECTOR software (1999) (Accelrys, Inc., San Diego, Calif.).

The terms “highly stringent” or “highly stringent condition” refer to conditions that permit hybridization of DNA strands whose sequences are highly complementary, wherein these same conditions exclude hybridization of significantly mismatched DNAs. Polynucleotide sequences capable of hybridizing under stringent conditions with the polynucleotides of the present invention may be, for example, variants of the disclosed polynucleotide sequences, including allelic or splice variants, or sequences that encode orthologs or paralogs of presently disclosed polypeptides. Nucleic acid hybridization methods are disclosed in detail by Kashima et al. (1985) Nature 313: 402-404, and Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (“Sambrook”); and by Haymes et al., “Nucleic Acid Hybridization: A Practical Approach”, IRL Press, Washington, D.C. (1985), which references are incorporated herein by reference.

In general, stringency is determined by the temperature, ionic strength, and concentration of denaturing agents (e.g., formamide) used in a hybridization and washing procedure (for a more detailed description of establishing and determining stringency, see below). The degree to which two nucleic acids hybridize under various conditions of stringency is correlated with the extent of their similarity. Thus, similar nucleic acid sequences from a variety of sources, such as within a plant's genome (as in the case of paralogs) or from another plant (as in the case of orthologs) that may perform similar functions can be isolated on the basis of their ability to hybridize with known transcription factor sequences. Numerous variations are possible in the conditions and means by which nucleic acid hybridization can be performed to isolate transcription factor sequences having similarity to transcription factor sequences known in the art and are not limited to those explicitly disclosed herein. Such an approach may be used to isolate polynucleotide sequences having various degrees of similarity with disclosed transcription factor sequences, such as, for example, transcription factors having 60% identity, or more preferably greater than about 70% identity, most preferably 72% or greater identity with disclosed transcription factors.
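To make the dependence of stringency on ionic strength and formamide concentration concrete, the following Python sketch evaluates one widely quoted empirical approximation for the melting temperature of a DNA-DNA duplex. The formula and its coefficients are assumptions brought in for illustration (published values differ slightly between references, particularly the formamide coefficient), and the function name is hypothetical; this paragraph does not itself specify a formula.

```python
import math


def estimate_tm_celsius(
    na_molar: float,
    gc_percent: float,
    formamide_percent: float,
    length_bp: int,
) -> float:
    """Empirical melting temperature estimate for a DNA-DNA duplex.

    Assumed approximation (coefficients vary slightly by reference):
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C)
             - 0.61*(% formamide) - 500/L
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


# Lower salt or more formamide lowers Tm, so the same wash temperature
# becomes more stringent.
standard = estimate_tm_celsius(1.0, 50.0, 0.0, 500)
low_salt = estimate_tm_celsius(0.1, 50.0, 0.0, 500)
with_formamide = estimate_tm_celsius(1.0, 50.0, 50.0, 500)
print(standard, low_salt, with_formamide)
```

Under this approximation, washing closer to the estimated melting temperature tolerates fewer mismatches, which is why hybridization under more stringent conditions recovers only more closely related sequences.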

The term “equivalog” describes members of a set of homologous proteins that are conserved with respect to function since their last common ancestor. Related proteins are grouped into equivalog families, and otherwise into protein families with other hierarchically defined homology types. This definition is provided at the Institute for Genomic Research (TIGR) website, www.tigr.org; “Terms associated with TIGRFAMs”.

The term “variant”, as used herein, may refer to polynucleotides or polypeptides that differ from the presently disclosed polynucleotides or polypeptides, respectively, in sequence from each other, and as set forth below.

With regard to polynucleotide variants, differences between presently disclosed polynucleotides and their variants are limited so that the nucleotide sequences of the former and the latter are closely similar overall and, in many regions, identical. The degeneracy of the genetic code dictates that many different variant polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing. Due to this degeneracy, differences between presently disclosed polynucleotides and variant nucleotide sequences may be silent in any given region or over the entire length of the polypeptide (i.e., the amino acids encoded by the polynucleotide are the same, and the variant polynucleotide sequence thus encodes the same amino acid sequence in that region or entire length of the presently disclosed polynucleotide). Variant nucleotide sequences may encode different amino acid sequences, in which case such nucleotide differences will result in amino acid substitutions, additions, deletions, insertions, truncations or fusions with respect to the similar disclosed polynucleotide sequences. These variations result in polynucleotide variants encoding polypeptides that share at least one functional characteristic (i.e., a presently disclosed transcription factor and a variant will confer at least one of the same functions to a plant).
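A short Python sketch can show what a silent difference means in practice: two different nucleotide sequences that translate to the same amino acid sequence under the standard genetic code. The sequences used are hypothetical and are not taken from the Sequence Listing; the function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence in frame 1; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent_variant(reference: str, variant: str) -> bool:
    """True if the sequences differ but encode the same amino acid sequence."""
    return (
        reference.upper() != variant.upper()
        and translate(reference) == translate(variant)
    )


# Hypothetical sequences, for illustration only.
reference = "ATGGCTCTT"   # Met-Ala-Leu
silent = "ATGGCCCTG"      # Met-Ala-Leu via synonymous codons
changed = "ATGGATCTT"     # Met-Asp-Leu, an amino acid substitution
print(translate(reference), is_silent_variant(reference, silent))
print(translate(changed), is_silent_variant(reference, changed))
```

The second comparison corresponds to the non-silent case described above, where a nucleotide difference produces an amino acid substitution.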

Within the scope of the invention is a variant of a nucleic acid listed in the Sequence Listing, that is, one having a sequence that differs from one of the polynucleotide sequences in the Sequence Listing, or a complementary sequence, that encodes a functionally equivalent polypeptide (i.e., a polypeptide having some degree of equivalent or similar biological activity) but differs in sequence from the sequence in the Sequence Listing, due to degeneracy in the genetic code.

“Allelic variant” or “polynucleotide allelic variant” refers to any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in phenotypic polymorphism within populations. Gene mutations may be “silent” or may encode polypeptides having altered amino acid sequences. “Allelic variant” and “polypeptide allelic variant” may also be used with respect to polypeptides, and in this case the terms refer to a polypeptide encoded by an allelic variant of a gene.

“Splice variant” or “polynucleotide splice variant” as used herein refers to alternative forms of RNA transcribed from a gene. Splice variation naturally occurs as a result of alternative sites being spliced within a single transcribed RNA molecule or between separately transcribed RNA molecules, and may result in several different forms of mRNA transcribed from the same gene. Thus, splice variants may encode polypeptides having different amino acid sequences, which, in the present context, will have at least one similar function in the organism (splice variation may also give rise to distinct polypeptides having different functions). “Splice variant” or “polypeptide splice variant” may also refer to a polypeptide encoded by a splice variant of a transcribed mRNA.

As used herein, “polynucleotide variants” may also refer to polynucleotide sequences that encode paralogs and orthologs of the presently disclosed polypeptide sequences. “Polypeptide variants” may refer to polypeptide sequences that are paralogs and orthologs of the presently disclosed polypeptide sequences.

Differences between presently disclosed polypeptides and polypeptide variants are limited so that the sequences of the former and the latter are closely similar overall and, in many regions, identical. Presently disclosed polypeptide sequences and similar polypeptide variants may differ in amino acid sequence by one or more substitutions, additions, deletions, fusions and truncations, which may be present in any combination. These differences may produce silent changes and result in a functionally equivalent transcription factor. Thus, it will be readily appreciated by those of skill in the art that any of a variety of polynucleotide sequences is capable of encoding the transcription factors and transcription factor homolog polypeptides of the invention. A polypeptide sequence variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties. Deliberate amino acid substitutions may thus be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the functional or biological activity of the transcription factor is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, positively charged amino acids may include lysine and arginine, and amino acids with uncharged polar head groups having similar hydrophilicity values may include leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; and phenylalanine and tyrosine. For more detail on conservative substitutions, see Table 5. More rarely, a variant may have “non-conservative” changes, e.g., replacement of a glycine with a tryptophan. Similar minor variations may also include amino acid deletions or insertions, or both. Related polypeptides may comprise, for example, additions and/or deletions of one or more N-linked or O-linked glycosylation sites, or an addition and/or a deletion of one or more cysteine residues. Guidance in determining which and how many amino acid residues may be substituted, inserted or deleted without abolishing functional or biological activity may be found using computer programs well known in the art, for example, DNASTAR software (see U.S. Pat. No. 5,840,544).
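The groupings named above can be turned into a simple lookup, sketched here in Python. The groups are limited to the residue sets listed in the preceding paragraph (written in one-letter code) and do not reproduce Table 5; the function name and the rule that a change is conservative only when both residues fall in the same group are illustrative assumptions.

```python
# Residue groups named in the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("DE"),   # aspartic acid, glutamic acid
    set("KR"),   # lysine, arginine
    set("LIV"),  # leucine, isoleucine, valine
    set("GA"),   # glycine, alanine
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("FY"),   # phenylalanine, tyrosine
]


def is_conservative(original: str, substitute: str) -> bool:
    """True if both residues differ and belong to the same listed group."""
    original, substitute = original.upper(), substitute.upper()
    if original == substitute:
        return False  # not a substitution at all
    return any(
        original in group and substitute in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("D", "E"))  # both negatively charged
print(is_conservative("G", "W"))  # glycine to tryptophan
```

A lookup of this kind only classifies the chemistry of a substitution; whether functional or biological activity is retained still has to be established separately.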

The term “plant” includes whole plants, shoot vegetative organs/structures (e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g., bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like), and progeny of same. The class of plants that can be used in the method of the invention is generally as broad as the class of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, bryophytes, and multicellular algae. (See for example, FIG. 1, adapted from Daly et al. (2001) Plant Physiol. 127: 1328-1333; FIG. 2, adapted from Ku et al. (2000) Proc. Natl. Acad. Sci. 97: 9121-9126; and see also Tudge, in The Variety of Life, Oxford University Press, New York, N.Y. (2000) pp. 547-606).

A “transgenic plant” refers to a plant that contains genetic material not found in a wild-type plant of the same species, variety or cultivar. The genetic material may include a transgene, an insertional mutagenesis event (such as by transposon or T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a homologous recombination event or a sequence modified by chimeraplasty. Typically, the foreign genetic material has been introduced into the plant by human manipulation, but any method can be used as one of skill in the art recognizes.

A transgenic plant may contain an expression vector or cassette. The expression cassette typically comprises a polypeptide-encoding sequence operably linked to (i.e., under regulatory control of) appropriate inducible or constitutive regulatory sequences that allow for the expression of the polypeptide. The expression cassette can be introduced into a plant by transformation or by breeding after transformation of a parent plant. A plant refers to a whole plant, including seedlings and mature plants, as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems that mimic biochemical or cellular components or processes in a cell.

“Fragment”, with respect to a polynucleotide, refers to a clone or any part of a polynucleotide molecule that retains a usable, functional characteristic. Useful fragments include oligonucleotides and polynucleotides that may be used in hybridization or amplification technologies or in the regulation of replication, transcription or translation. A “polynucleotide fragment” refers to any subsequence of a polynucleotide, typically of at least about 9 consecutive nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50 nucleotides, of any of the sequences provided herein. Exemplary polynucleotide fragments are the first sixty consecutive nucleotides of the transcription factor polynucleotides listed in the Sequence Listing. Exemplary fragments also include fragments that comprise a region that encodes a conserved domain of a transcription factor.

Fragments may also include subsequences of polypeptides and protein molecules, or a subsequence of the polypeptide. Fragments may have uses in that they may have antigenic potential. In some cases, the fragment or domain is a subsequence of the polypeptide that performs at least one biological function of the intact polypeptide in substantially the same manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide fragment can comprise a recognizable structural motif or functional domain such as a DNA-binding site or domain that binds to a DNA promoter region, an activation domain, or a domain for protein-protein interactions, and may initiate transcription. Fragments can vary in size from as few as 3 amino acids to the full length of the intact polypeptide, but are preferably at least about 30 amino acids in length and more preferably at least about 60 amino acids in length. Exemplary polypeptide fragments are the first twenty consecutive amino acids of the transcription factor polypeptides listed in the Sequence Listing.

Exemplary fragments also include fragments that comprise a conserved domain of a transcription factor. An example of such an exemplary fragment would include amino acid residues 20-110 of G481 (SEQ ID NO: 88), as noted in Table 8.

The invention also encompasses production of DNA sequences that encode transcription factors and transcription factor derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding transcription factors or any fragment thereof.

A “conserved domain” or “conserved region” as used herein refers to a region in heterologous polynucleotide or polypeptide sequences where there is a relatively high degree of sequence identity between the distinct sequences.

With respect to polynucleotides encoding presently disclosed transcription factors, a conserved region is preferably at least 10 base pairs (bp) in length.

A “conserved domain”, with respect to presently disclosed polypeptides, refers to a domain within a transcription factor family that exhibits a higher degree of sequence homology, such as at least 26% sequence similarity, at least 16% sequence identity, preferably at least 40% sequence identity, preferably at least 65% sequence identity including conservative substitutions, and more preferably at least 80% sequence identity, and even more preferably at least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at least about 90%, or at least about 95%, or at least about 98% amino acid residue sequence identity of a polypeptide of consecutive amino acid residues. A fragment or domain can be referred to as outside a conserved domain, outside a consensus sequence, or outside a consensus DNA-binding site that is known to exist or that exists for a particular transcription factor class, family, or sub-family. In this case, the fragment or domain will not include the exact amino acids of a consensus sequence or consensus DNA-binding site of a transcription factor class, family or sub-family, or the exact amino acids of a particular transcription factor consensus sequence or consensus DNA-binding site. Furthermore, a particular fragment, region, or domain of a polypeptide, or a polynucleotide encoding a polypeptide, can be “outside a conserved domain” if all the amino acids of the fragment, region, or domain fall outside of a defined conserved domain(s) for a polypeptide or protein. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.
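The “outside a conserved domain” condition is an interval test: every residue of the fragment must lie outside every defined conserved domain. A minimal Python sketch follows, assuming 1-based, inclusive residue coordinates; the function name is hypothetical, and the domain used in the example is the residue 20-110 range cited elsewhere in this description for G481.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive


def is_outside_conserved_domains(
    fragment: Interval, domains: Iterable[Interval]
) -> bool:
    """True if no residue of the fragment lies within any conserved domain."""
    frag_start, frag_end = fragment
    return all(
        frag_end < dom_start or frag_start > dom_end
        for dom_start, dom_end in domains
    )


conserved = [(20, 110)]
print(is_outside_conserved_domains((1, 19), conserved))     # ends before the domain
print(is_outside_conserved_domains((15, 25), conserved))    # overlaps residues 20-25
print(is_outside_conserved_domains((111, 150), conserved))  # starts after the domain
```

A fragment that overlaps a domain by even one residue fails the test, which matches the requirement that all of its amino acids fall outside the defined domain(s).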

As one of ordinary skill in the art recognizes, conserved domains of transcription factors may be identified as regions or domains of identity to a specific consensus sequence (see, for example, Riechmann et al. (2000) supra). Thus, by using alignment methods well known in the art, the conserved domains of the plant transcription factors for each of the following may be determined: the AP2 (APETALA2) domain transcription factor family (Riechmann and Meyerowitz (1998) supra); the MYB transcription factor family (ENBib; Martin and Paz-Ares (1997) supra); the MADS domain transcription factor family (Riechmann and Meyerowitz (1997) supra); the WRKY protein family (Ishiguro and Nakamura (1994) supra); the ankyrin-repeat protein family (Zhang et al. (1992) supra); the zinc finger protein (Z) family (Klug and Schwabe (1995) supra; Takatsuji (1998) supra); the homeobox (HB) protein family (Buerglin (1994) supra); the CAAT-element binding proteins (Forsburg and Guarente (1989) supra); the squamosa promoter binding proteins (SPB) (Klein et al. (1996) supra); the NAM protein family (Souer et al. (1996) supra); the IAA/AUX proteins (Abel et al. (1995) supra); the HLH/MYC protein family (Littlewood et al. (1994) supra); the DNA-binding protein (DBP) family (Tucker et al. (1994) supra); the bZIP family of transcription factors (Foster et al. (1994) supra); the Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) supra); the high mobility group (HMG) family (Bustin and Reeves (1996) supra); the scarecrow (SCR) family (Di Laurenzio et al. (1996) supra); the GF14 family (Wu et al. (1997) supra); the polycomb (PCOMB) family (Goodrich et al. (1997) supra); the teosinte branched (TEO) family (Luo et al. (1996) supra); the ABI3 family (Giraudat et al. (1992) supra); the triple helix (TH) family (Dehesh et al. (1990) supra); the EIL family (Chao et al. (1997) supra); the AT-HOOK family (Reeves and Nissen (1990) supra); the S1FA family (Zhou et al. (1995) supra); the bZIPT2 family (Lu and Ferl (1995) supra); the YABBY family (Bowman et al. (1999) supra); the PAZ family (Bohmert et al. (1998) supra); a family of miscellaneous (MISC) transcription factors including the DPBF family (Kim et al. (1997) supra) and the SPF1 family (Ishiguro and Nakamura (1994) supra); the GARP family (Hall et al. (1998) supra), the TUBBY family (Boggin et al. (1999) supra), the heat shock family (Wu (1995) supra), the ENBP family (Christiansen et al. (1996) supra), the RING-zinc family (Jensen et al. (1998) supra), the PDBP family (Janik et al. (1989) supra), the PCF family (Cubas et al. (1999) supra), the SRS (SHI-related) family (Fridborg et al. (1999) supra), the CPP (cysteine-rich polycomb-like) family (Cvitanich et al. (2000) supra), the ARF (auxin response factor) family (Ulmasov et al. (1999) supra), the SWI/SNF family (Collingwood et al. (1999) supra), the ACBF family (Seguin et al. (1997) supra), the PCGL (CG-1 like) family (da Costa e Silva et al. (1994) supra), the ARID family (Vazquez et al. (1999) supra), the Jumonji family (Balciunas et al. (2000) supra), the bZIP-NIN family (Schauser et al. (1999) supra), the E2F family (Kaelin et al. (1992) supra) and the GRF-like family (Knaap et al. (2000) supra).

The conserved domains for each of the polypeptides of SEQ ID NO: 2N, wherein N=1-229, are listed in Table 8 as described in Example VII. Also, many of the polypeptides of Table 8 have conserved domains specifically indicated by start and stop sites. A comparison of the regions of the polypeptides in SEQ ID NO: 2N, wherein N=1-229, or of those in Table 8, allows one of skill in the art to identify conserved domain(s) for any of the polypeptides listed or referred to in this disclosure, including those in Tables 7-11.

As used herein, a “gene” is a functional unit of inheritance, and in physical terms is a particular segment or sequence of nucleotides along a molecule of DNA (or RNA, in the case of RNA viruses) involved in producing a functional RNA molecule, such as one used for a structural or regulatory role, or a polypeptide chain, such as one used for a structural or regulatory role (an example of the latter would be transcription regulation, as by a transcription factor polypeptide). Polypeptides may then be subjected to subsequent processing such as splicing and/or folding to obtain a functional polypeptide. A gene may be isolated, partially isolated, or be found within an organism's genome. By way of example, a transcription factor gene encodes a transcription factor polypeptide, which may be functional with or without additional processing to function as an initiator of transcription.

Operationally, genes may be defined by the cis-trans test, a genetic test that determines whether two mutations occur in the same gene and which may be used to determine the limits of the genetically active unit (Rieger et al. (1976) Glossary of Genetics and Cytogenetics: Classical and Molecular, 4th ed., Springer Verlag, Berlin). A gene generally includes regions preceding (“leaders”; upstream) and following (“trailers”; downstream) the coding region. A gene may also include intervening, non-coded sequences, referred to as “introns”, which are located between individual coding segments, referred to as “exons”. Most genes have an identifiable associated promoter region, a regulatory sequence 5′ or upstream of the transcription initiation codon. The function of a gene may also be regulated by enhancers, operators, and other regulatory elements.

A “trait” refers to a physiological, morphological, biochemical, or physical characteristic of a plant or particular plant material or cell. In some instances, this characteristic is visible to the human eye, such as seed or plant size, or can be measured by biochemical techniques, such as detecting the protein, starch, or oil content of seed or leaves, or by observation of a metabolic or physiological process, e.g., by measuring uptake of carbon dioxide, or by the observation of the expression level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray gene expression assays, or reporter gene expression systems, or by agricultural observations such as stress tolerance, yield, or pathogen tolerance. Any technique can be used to measure the amount of, comparative level of, or difference in any selected chemical compound or macromolecule in the transgenic plants, however.

“Trait modification” refers to a detectable difference in a characteristic in a plant ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant not doing so, such as a wild-type plant. In some cases, the trait modification can be evaluated quantitatively. For example, the trait modification can entail at least about a 2% increase or decrease in an observed trait (difference), at least a 5% difference, at least about a 10% difference, at least about a 20% difference, at least about a 30%, at least about a 50%, at least about a 70%, or at least about a 100%, or an even greater difference compared with a wild-type plant. It is known that there can be a natural variation in the modified trait. Therefore, the trait modification observed entails a change of the normal distribution of the trait in the plants compared with the distribution observed in wild-type plants.
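One simple way to put a number on such a difference is to compare group means, as in the Python sketch below. The function names, the use of the arithmetic mean, and the sample measurements are illustrative assumptions; because the trait varies naturally, a real evaluation would compare the two distributions with an appropriate statistical test and not the means alone.

```python
from statistics import mean
from typing import Sequence


def percent_difference(
    transgenic: Sequence[float], wild_type: Sequence[float]
) -> float:
    """Signed percent difference of the transgenic mean from the wild-type mean."""
    wild_type_mean = mean(wild_type)
    return 100.0 * (mean(transgenic) - wild_type_mean) / wild_type_mean


def meets_threshold(
    transgenic: Sequence[float],
    wild_type: Sequence[float],
    threshold_percent: float = 2.0,
) -> bool:
    """True if the increase or decrease is at least threshold_percent."""
    return abs(percent_difference(transgenic, wild_type)) >= threshold_percent


# Hypothetical seed-size measurements, for illustration only.
wild_type = [10, 10, 10]
transgenic = [11, 11, 11]
print(percent_difference(transgenic, wild_type))
print(meets_threshold(transgenic, wild_type, threshold_percent=20.0))
```

Taking the absolute value in the threshold check treats increases and decreases alike, consistent with the text counting either direction as a trait modification.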

The term “transcript profile” refers to the expression levels of a set of genes in a cell in a particular state, particularly by comparison with the expression levels of that same set of genes in a cell of the same type in a reference state. For example, the transcript profile of a particular transcription factor in a suspension cell is the expression levels of a set of genes in a cell overexpressing that transcription factor compared with the expression levels of that same set of genes in a suspension cell that has normal levels of that transcription factor. The transcript profile can be presented as a list of those genes whose expression level is significantly different between the two treatments, and the difference ratios. Differences and similarities between expression levels may also be evaluated and calculated using statistical and clustering methods.
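As an illustration of presenting a transcript profile as a list of genes with their difference ratios, the following Python sketch compares expression levels between an overexpressing cell and a reference cell. The gene names and values are hypothetical, and the fixed fold-change cutoff is only a stand-in for a proper significance test on replicated measurements.

```python
from typing import Dict


def transcript_profile(
    overexpressor: Dict[str, float],
    reference: Dict[str, float],
    min_fold: float = 2.0,
) -> Dict[str, float]:
    """Map each differing gene to its overexpressor/reference ratio.

    Assumption: a gene counts as different if its ratio is at least
    min_fold or at most 1/min_fold. Genes missing from either sample,
    or with a non-positive reference level, are skipped.
    """
    profile = {}
    for gene, reference_level in reference.items():
        if gene not in overexpressor or reference_level <= 0:
            continue
        ratio = overexpressor[gene] / reference_level
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            profile[gene] = ratio
    return profile


# Hypothetical expression levels, for illustration only.
reference = {"geneA": 10.0, "geneB": 10.0, "geneC": 10.0}
overexpressor = {"geneA": 40.0, "geneB": 11.0, "geneC": 2.5}
print(transcript_profile(overexpressor, reference))
```

Ratios above 1 indicate genes expressed at higher levels in the overexpressing cell and ratios below 1 indicate lower levels; genes near a ratio of 1 are left out of the profile.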

“Wild type”, as used herein, refers to a cell, tissue or plant that has not been genetically modified to knock out or overexpress one or more of the presently disclosed transcription factors. Wild-type cells, tissue or plants may be used as controls to compare levels of expression and the extent and nature of trait modification with modified (e.g., transgenic) cells, tissue or plants in which transcription factor expression is altered or ectopically expressed by, for example, knocking out or overexpressing a gene.

“Ectopic expression” or “altered expression” in reference to a polynucleotide indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is different from the expression pattern in a wild-type plant or a reference plant of the same species. The pattern of expression may also be compared with a reference expression pattern in a wild-type plant of the same species. For example, the polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or tissue type in which the sequence is expressed in the wild-type plant, or by expression at a time other than at the time the sequence is expressed in the wild-type plant, or by a response to different inducible agents, such as hormones or environmental signals, or at different expression levels (either higher or lower) compared with those found in a wild-type plant. Altered expression may be achieved by, for example, transformation of a plant with an expression cassette having a constitutive or inducible promoter element associated with a transcription factor gene. The resulting expression pattern can thus be constitutive or inducible, and be stable or transient. Altered or ectopic expression may also refer to altered expression patterns that are produced by lowering the levels of expression to below the detection level or completely abolishing expression by, for example, knocking out a gene's expression by disrupting expression or regulation of the gene with an insertion element.

In reference to a polypeptide, the term “ectopic expression or altered expression” further may relate to altered activity levels resulting from the interactions of the polypeptides with exogenous or endogenous modulators or from interactions with factors or as a result of the chemical modification of the polypeptides.

The term “overexpression” as used herein refers to a greater expression level of a gene in a plant, plant cell or plant tissue, compared to expression in a wild-type plant, cell or tissue, at any developmental or temporal stage for the gene. Overexpression can occur when, for example, the genes encoding one or more transcription factors are under the control of a strong expression signal, such as one of the promoters described herein (e.g., the cauliflower mosaic virus 35S transcription initiation region). Overexpression may occur throughout a plant or in specific tissues of the plant, depending on the promoter used, as described below.

Overexpression may take place in plant cells normally lacking expression of polypeptides functionally equivalent or identical to the present transcription factors. Overexpression may also occur in plant cells where endogenous expression of the present transcription factors or functionally equivalent molecules normally occurs, but such normal expression is at a lower level than in the organism or tissues of the overexpressor. Overexpression thus results in a greater than normal production, or “overproduction” of the transcription factor in the plant, cell or tissue.

The term “phase change” refers to a plant's progression from embryo to adult, and, by some definitions, the transition wherein flowering plants gain reproductive competency. It is believed that phase change occurs either after a certain number of cell divisions in the shoot apex of a developing plant, or when the shoot apex achieves a particular distance from the roots. Thus, altering the timing of phase changes may affect a plant's size, which, in turn, may affect yield and biomass.

Traits that May be Modified in Overexpressing or Knock-Out Plants

Trait modifications of particular interest include those to seed (such as embryo or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: enhanced tolerance to environmental conditions including freezing, chilling, heat, drought, water saturation, radiation and ozone; improved tolerance to microbial, fungal or viral diseases; improved tolerance to pest infestations, including insects, nematodes, mollicutes, parasitic higher plants or the like; decreased herbicide sensitivity; improved tolerance of heavy metals or enhanced ability to take up heavy metals; improved growth under poor photoconditions (e.g., low light and/or short day length), or changes in expression levels of genes of interest. Other phenotypes that can be modified relate to the production of plant metabolites, such as variations in the production of taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax monomers, anti-oxidants, amino acids, lignins, cellulose, tannins, prenyllipids (such as chlorophylls and carotenoids), glucosinolates, and terpenoids, enhanced or compositionally altered protein or oil production (especially in seeds), or modified sugar (insoluble or soluble) and/or starch composition. Physical plant characteristics that can be modified include cell development (such as the number of trichomes), fruit and seed size and number, yields of plant parts such as stems, leaves, inflorescences, and roots, the stability of the seeds during storage, characteristics of the seed pod (e.g., susceptibility to shattering), root hair length and quantity, internode distances, or the quality of seed coat. Plant growth characteristics that can be modified include growth rate, germination rate of seeds, vigor of plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering time, flower abscission, rate of nitrogen uptake, osmotic sensitivity to soluble sugar concentrations, biomass or transpiration characteristics, as well as plant architecture characteristics such as apical dominance, branching patterns, number of organs, organ identity, organ shape or size.

Transcription Factors Modify Expression of Endogenous Genes

Expression of genes that encode transcription factors that modify expression of endogenous genes, polynucleotides, and proteins is well known in the art. In addition, transgenic plants comprising isolated polynucleotides encoding transcription factors may also modify expression of endogenous genes, polynucleotides, and proteins. Examples include Peng et al. (1997) Genes and Development 11: 3194-3205, and Peng et al. (1999) Nature 400: 256-261. In addition, many others have demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant species elicits the same or very similar phenotypic response. See, for example, Fu et al. (2001) Plant Cell 13: 1791-1802; Nandi et al. (2000) Curr. Biol. 10: 215-218; Coupland (1995) Nature 377: 482-483; and Weigel and Nilsson (1995) Nature 377: 482-500.

In another example, Mandel et al. (1992) Cell 71: 133-143 and Suzuki et al. (2001) Plant J. 28: 409-418, teach that a transcription factor expressed in another plant species elicits the same or very similar phenotypic response of the endogenous sequence, as often predicted in earlier studies of Arabidopsis transcription factors in Arabidopsis (see Mandel et al. (1992) supra; Suzuki et al. (2001) supra).

Other examples include Müller et al. (2001) Plant J. 28: 169-179; Kim et al. (2001) Plant J. 25: 247-259; Kyozuka and Shimamoto (2002) Plant Cell Physiol. 43: 130-135; Boss and Thomas (2002) Nature 416: 847-850; He et al. (2000) Transgenic Res. 9: 223-227; and Robson et al. (2001) Plant J. 28: 619-631.

In yet another example, Gilmour et al. (1998) Plant J. 16: 433-442, teach an Arabidopsis AP2 transcription factor, CBF1 (SEQ ID NO: 1956), which, when overexpressed in transgenic plants, increases plant freezing tolerance. Jaglo et al. (2001) Plant Physiol. 127: 910-917, further identified sequences in Brassica napus which encode CBF-like genes and that transcripts for these genes accumulated rapidly in response to low temperature. Transcripts encoding CBF-like proteins were also found to accumulate rapidly in response to low temperature in wheat, as well as in tomato. An alignment of the CBF proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of conserved consecutive amino acid residues, PKK/RPAGRxKFxETRHP and DSAWR, that bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them from other members of the AP2/EREBP protein family. (See Jaglo et al. supra).

Gao et al. ((2002) Plant Molec. Biol. 49: 459-471) have recently described four CBF transcription factors from Brassica napus: BNCBFs 5, 7, 16 and 17. They note that the first three CBFs (GenBank Accession Numbers AAM18958, AAM18959, and AAM18960, respectively) are very similar to Arabidopsis CBF1, whereas BNCBF17 (GenBank Accession Number AAM18961) is similar but contains two extra regions of 16 and 21 amino acids in its acidic activation domain. All four B. napus CBFs accumulate in leaves of the plants after cold-treatment, and BNCBFs 5, 7 and 16 accumulated after salt stress treatment. The authors concluded that these BNCBFs likely function in low-temperature responses in B. napus.

In a functional study of CBF genes, Hsieh et al. ((2002) Plant Physiol. 129: 1086-1094) found that heterologous expression of Arabidopsis CBF1 in tomato plants confers increased tolerance to chilling and considerable tolerance to oxidative stress, which suggested to the authors that ectopic Arabidopsis CBF1 expression may induce several tomato stress responsive genes to protect the plants.

Transcription factors mediate cellular responses and control traits through altered expression of genes containing cis-acting nucleotide sequences that are targets of the introduced transcription factor. It is well appreciated in the art that the effect of a transcription factor on cellular responses or a cellular trait is determined by the particular genes whose expression is either directly or indirectly (e.g., by a cascade of transcription factor binding events and transcriptional changes) altered by transcription factor binding. In a global analysis of transcription comparing a standard condition with one in which a transcription factor is overexpressed, the resulting transcript profile associated with transcription factor overexpression is related to the trait or cellular process controlled by that transcription factor. For example, the PAP2 gene (and other genes in the MYB family) have been shown to control anthocyanin biosynthesis through regulation of the expression of genes known to be involved in the anthocyanin biosynthetic pathway (Bruce et al. (2000) Plant Cell 12: 65-79; and Borevitz et al. (2000) Plant Cell 12: 2383-2393). Further, global transcript profiles have been used successfully as diagnostic tools for specific cellular states (e.g., cancerous vs. non-cancerous; Bhattacharjee et al. (2001) Proc. Natl. Acad. Sci. USA 98: 13790-13795; and Xu et al. (2001) Proc. Natl. Acad. Sci. USA 98: 15089-15094). Consequently, it is evident to one skilled in the art that similarity of transcript profile upon overexpression of different transcription factors would indicate similarity of transcription factor function.
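One way the similarity of two transcript profiles could be quantified is sketched below. This is an illustrative assumption rather than a method stated in this description: each profile is taken to be a mapping from gene to difference ratio, and similarity is scored as the Pearson correlation of log2 ratios over the genes the two profiles share. The profile contents are hypothetical.

```python
import math

def profile_similarity(profile_a, profile_b):
    """Pearson correlation of log2 difference ratios over shared genes."""
    shared = sorted(set(profile_a) & set(profile_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    xs = [math.log2(profile_a[g]) for g in shared]
    ys = [math.log2(profile_b[g]) for g in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical difference ratios for two overexpressed transcription factors.
tf1_profile = {"geneA": 4.0, "geneB": 0.5, "geneC": 2.0}
tf2_profile = {"geneA": 3.5, "geneB": 0.4, "geneC": 2.5, "geneD": 8.0}

print(round(profile_similarity(tf1_profile, tf2_profile), 3))
```

A value near 1 would suggest that the two factors shift the shared genes in the same direction and to a similar degree; clustering methods apply the same idea across many profiles at once.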

Polypeptides and Polynucleotides of the Invention

The present invention provides, among other things, transcription factors (TFs), and transcription factor homolog polypeptides, and isolated or recombinant polynucleotides encoding the polypeptides, or novel sequence variant polypeptides or polynucleotides encoding novel variants of transcription factors derived from the specific sequences provided here. These polypeptides and polynucleotides may be employed to modify a plant's characteristics.

Exemplary polynucleotides encoding the polypeptides of the invention were identified in the Arabidopsis thaliana GenBank database using publicly available sequence analysis programs and parameters. Sequences initially identified were then further characterized to identify sequences comprising specified sequence strings corresponding to sequence motifs present in families of known transcription factors. In addition, further exemplary polynucleotides encoding the polypeptides of the invention were identified in the plant GenBank database using publicly available sequence analysis programs and parameters. Sequences initially identified were then further characterized to identify sequences comprising specified sequence strings corresponding to sequence motifs present in families of known transcription factors. Polynucleotide sequences meeting such criteria were confirmed as transcription factors.

Additional polynucleotides of the invention were identified by screening Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to known transcription factors under low stringency hybridization conditions. Additional sequences, including full length coding sequences, were subsequently recovered by the rapid amplification of cDNA ends (RACE) procedure, using a commercially available kit according to the manufacturer's instructions. Where necessary, multiple rounds of RACE were performed to isolate 5′ and 3′ ends. The full-length cDNA was then recovered by a routine end-to-end polymerase chain reaction (PCR) using primers specific to the isolated 5′ and 3′ ends. Exemplary sequences are provided in the Sequence Listing.

The polynucleotides of the invention can be or were ectopically expressed in overexpressor or knockout plants and the changes in the characteristic(s) or trait(s) of the plants observed. Therefore, the polynucleotides and polypeptides can be employed to improve the characteristics of plants.

The polynucleotides of the invention can be or were ectopically expressed in overexpressor plant cells and the changes in the expression levels of a number of genes, polynucleotides, and/or proteins of the plant cells observed. Therefore, the polynucleotides and polypeptides can be employed to change expression levels of genes, polynucleotides, and/or proteins of plants.

Specific examples of the polypeptides and polynucleotides of the invention and experimental observations made after modifying their expression in plants may be found in the following text and tables.

CCAAT-Element Binding Protein Transcription Factor Family

G481 (AT2G38880) and G482 are members of the HAP3 sub-group of the CCAAT binding factor family (CAAT). G481, G482 and their related sequences were included in a program to test the ability of these sequences to confer the same drought-related abiotic stress tolerance previously observed by us in 35S::G481 lines.

The CAAT family of transcription factors, also referred to as the “CCAAT” or “CCAAT-box” family, are characterized by their ability to bind to the CCAAT-box element located 80 to 300 bp 5′ from a transcription start site (Gelinas et al. (1985) Nature 313: 323-325). The CCAAT-box is a conserved cis-acting regulatory element with the consensus sequence CCAAT that is found in the promoters of genes from all eukaryotic species. The element can act in either orientation, alone or as multimeric regions with possible cooperation with other cis regulatory elements (Tasanen et al. (1992) J. Biol. Chem. 267: 11513-11519). It has been estimated that 25% of eukaryotic promoters harbor this element (Bucher (1988) J. Biomol. Struct. Dyn. 5: 1231-1236). CCAAT-box elements have been shown to function in the regulation of gene expression in plants (Rieping and Schoffl (1992) Mol. Gen. Genet. 231: 226-232; Kehoe et al. (1994) Plant Cell 6: 1123-1134; Ito et al. (1995) Plant Cell Physiol. 36: 1281-1289). Several reports have described the importance of the CCAAT-binding element for regulated expression, including the regulation of genes that are responsive to light (Kusnetsov et al. (1999) J. Biol. Chem. 274: 36009-36014; Carre and Kay (1995) Plant Cell 7: 2039-2051) as well as stress (Rieping and Schoffl (1992) supra). Specifically, a CCAAT-box motif was shown to be important for the light regulated expression of the CAB2 promoter in Arabidopsis; however, the proteins that bind to the site were not identified (Carre and Kay (1995) supra). To date, no specific Arabidopsis CCAAT-box binding protein has been functionally associated with its corresponding target genes. In October of 2002 at an EPSO meeting on Plant Networks, a seminar was given by Detlef Weigel (Tuebingen) on the control of AGAMOUS, a floral organ identity gene, in Arabidopsis. In order to find important cis-elements that regulate AGAMOUS activity, he aligned the promoter regions from 29 different Brassicaceae species and showed that there were two highly conserved regions: one well characterized site that binds LEAFY/WUS heterodimers and another putative CCAAT-box binding motif. We have discovered several CCAAT-box genes that regulate flowering time and are candidates for binding to the AGAMOUS promoter. One of these genes, G485, is a HAP3-like protein that is closely related to G481. Gain of function and loss of function studies on G485 reveal opposing effects on flowering time, indicating that the gene is both sufficient to act as a floral activator, and is also necessary in that role within the plant.
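The positional description of the CCAAT-box given above (80 to 300 bp 5′ of a transcription start site, acting in either orientation) can be sketched as a simple promoter scan. This is a minimal illustration only: the function name, the distance convention (counted from the transcription start site to the leftmost base of the motif in the supplied sequence) and the test sequence are assumptions, and an exact string match is a much cruder criterion than actual binding.

```python
def find_ccaat_boxes(upstream, min_dist=80, max_dist=300):
    """Find CCAAT boxes in either orientation within a distance window.

    `upstream` is the sequence 5' of the transcription start site (TSS),
    written 5'->3', with its last base immediately adjacent to the TSS.
    Returns a sorted list of (distance, strand) tuples, where distance is
    the number of bases from the leftmost base of the motif to the TSS.
    """
    seq = upstream.upper()
    hits = []
    # "ATTGG" is the reverse complement of "CCAAT" (a box on the other strand).
    for motif, strand in (("CCAAT", "+"), ("ATTGG", "-")):
        start = seq.find(motif)
        while start != -1:
            distance = len(seq) - start
            if min_dist <= distance <= max_dist:
                hits.append((distance, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)

# Hypothetical upstream region: one CCAAT box 105 bp from the TSS.
promoter = "A" * 50 + "CCAAT" + "G" * 100
print(find_ccaat_boxes(promoter))
# [(105, '+')]
```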

The first proteins shown to bind to the CCAAT-box element were identified in yeast. The CCAAT-box transcription factors bind as a hetero-tetrameric complex called the HAP complex (heme activator protein complex) or the CCAAT binding factor (Forsburg and Guarente (1988) Mol. Cell Biol. 8: 647-654). The HAP complex in yeast is composed of at least four subunits, HAP2, HAP3, HAP4 and HAP5. In addition, the proteins that make up the HAP2, 3, 4, 5 complex are represented by single genes. Their function is specific for the activation of genes involved in mitochondrial biogenesis and energy metabolism (Dang et al. (1996) Mol. Microbiol. 22: 681-692). In mammals, the CCAAT binding factor is a trimeric complex consisting of NF-YA (HAP2-like), NF-YB (HAP3-like) and NF-YC (HAP5-like) subunits (Maity and de Crombrugghe (1998) Trends Biochem. Sci. 23: 174-178). In plants, analogous members of the CCAAT binding factor complex are represented by small gene families, and it is likely that these genes play a more complex role in regulating gene transcription. In Arabidopsis there are ten members of the HAP2 subfamily, ten members of the HAP3 subfamily, and thirteen members of the HAP5 subfamily. Plants and mammals, however, do not appear to have a protein equivalent of HAP4 of yeast. HAP4 is not required for DNA binding in yeast although it provides the primary activation domain for the complex (McNabb et al. (1995) Genes Dev. 9: 47-58; Olesen and Guarente (1990) Genes Dev. 4: 1714-1729).

In mammals, the CCAAT-box element is found in the promoters of many genes and it has therefore been proposed that CCAAT binding factors serve as general transcriptional regulators that influence the frequency of transcriptional initiation (Maity and de Crombrugghe (1998) supra). CCAAT binding factors, however, can serve to regulate target promoters in response to environmental cues and it has been demonstrated that assembly of CCAAT binding factors on target promoters occurs in response to a variety of signals (Myers et al. (1986) Science 232: 613-618; Maity and de Crombrugghe (1998) supra; Bezhani et al. (2001) J. Biol. Chem. 276: 23785-23789). Mammalian CP1 and NF-Y are both heterotrimeric CCAAT binding factor complexes (Johnson and McKnight (1989) Ann. Rev. Biochem. 58: 799-839). Plant CCAAT binding factors are assumed to be trimeric, as is the case in mammals; however, they could associate with other transcription factors on target promoters as part of a larger complex. The CCAAT box is generally found in close proximity to other promoter elements and it is generally accepted that the CCAAT binding factor functions synergistically with other transcription factors in the regulation of transcription. In addition, it has recently been shown that a HAP3-like protein from rice, OsNF-YB1, interacts with a MADS-box protein OsMADS18 in vitro (Masiero et al. (2002) J. Biol. Chem. 277: 26429-26435). It was also shown that the in vitro ternary complex between these two types of transcription factors requires both that OsNF-YB1 form a dimer with a HAP5-like protein and that OsMADS18 form a heterodimer with another MADS-box protein. Interestingly, the OsNF-YB1/HAP5 protein dimer is incapable of interacting with HAP2-like subunits and therefore cannot bind the CCAAT element. The authors therefore speculate that there is a select set of HAP3-like proteins in plants that act on non-CCAAT promoter elements by virtue of their interaction with other non-CCAAT transcription factors (Masiero et al. (2002) supra). In support of this, HAP3/HAP5 subunit dimers have been shown to be able to interact with TFIID in the absence of HAP2 subunits (Romier et al. (2003) J. Biol. Chem. 278: 1336-1345).

The CCAAT-box motif is found in the promoters of a variety of plant genes. In addition, the expression pattern of many of the HAP-like genes in Arabidopsis shows developmental regulation. We have used RT-PCR to analyze the endogenous expression of 31 of the 34 CCAAT-box proteins. Our findings suggest that while most of the CCAAT-box gene transcripts are found ubiquitously throughout the plant, in more than half of the cases, the genes are predominantly expressed in flower, embryo and/or silique tissues. Cell-type specific localization of the CCAAT genes in Arabidopsis would be very informative and could help determine the activity of various CCAAT genes in the plant.

Genetic analysis has determined the function of one Arabidopsis CCAAT gene, LEAFY COTYLEDON1 (LEC1). LEC1 is a HAP3 subunit gene whose product accumulates only during seed development. Arabidopsis plants carrying a mutation in the LEC1 gene display embryos that are intolerant to desiccation and that show defects in seed maturation (Lotan et al. (1998) Cell 93: 1195-1205). This phenotype can be rescued if the embryos are allowed to grow before the desiccation process occurs during normal seed maturation. This result suggests LEC1 has a role in allowing the embryo to survive desiccation during seed maturation. The mutant plants also possess trichomes, or epidermal hairs, on their cotyledons, a characteristic that is normally restricted to adult tissues like leaves and stems. Such an effect suggests that LEC1 also plays a role in specifying embryonic organ identity. In addition to the mutant analysis, the ectopic expression (unregulated overexpression) of the wild type LEC1 gene induces embryonic programs and embryo development in vegetative cells, consistent with its role in coordinating higher plant embryo development. The ortholog of LEC1 has been identified recently in maize. The expression pattern of ZmLEC1 in maize during somatic embryo development is similar to that of LEC1 in Arabidopsis during zygotic embryo development (Zhang et al. (2002) Planta 215: 191-194).

Matching the CCAAT transcription factors with target promoters and the analysis of the knockout and overexpression mutant phenotypes will help sort out whether these proteins act specifically or non-specifically in the control of plant pathways. The fact that CCAAT-box elements are not present in most plant promoters suggests that plant CCAAT binding factors most likely do not function as general components of the transcriptional machinery. In addition, the very specific role of the LEC1 protein in plant developmental processes supports the idea that CCAAT-box binding complexes play very specific roles in plant growth and development.

The Domain Structure of CCAAT-Element Binding Transcription Factors and Novel Conserved Domains in Arabidopsis and Other Species

Plant CCAAT binding factors potentially bind DNA as heterotrimers composed of HAP2-like, HAP3-like and HAP5-like subunits. All subunits contain regions that are required for DNA binding and subunit association. The subunit proteins appear to lack activation domains; therefore, that function must come from proteins with which they interact on target promoters. No proteins that provide the activation domain function for CCAAT binding factors have been identified in plants. In yeast, however, the HAP4 protein provides the primary activation domain (McNabb et al. (1995) Genes Dev. 9: 47-58; Olesen and Guarente (1990) Genes Dev. 4: 1714-1729).

HAP2-, HAP3- and HAP5-like proteins have two highly conserved sub-domains, one that functions in subunit interaction and the other that acts in a direct association with DNA. Outside these two regions, non-paralogous Arabidopsis HAP-like proteins are quite divergent in sequence and in overall length.

The general domain structure of HAP3 proteins is found in FIG. 9. HAP3 proteins contain an amino-terminal A domain, a central B domain and a carboxy-terminal C domain. There is very little sequence similarity between HAP3 proteins in the A and C domains; it is therefore reasonable to assume that the A and C domains could provide a degree of functional specificity to each member of the HAP3 subfamily. The B domain is the conserved region that specifies DNA binding and subunit association.

In FIGS. 10A-10F, HAP3 proteins from Arabidopsis, soybean, rice and corn are aligned with G481, with the A, B and C domains and the DNA binding and subunit interaction domains indicated. As can be seen in FIGS. 10B-10C, the B domain of the non-LEC1-like clade (identified in FIGS. 6 and 7) may be distinguished by the amino acid residues:

Ser/Gly-Arg-Ile/Leu-Met-Lys-(Xaa)₂-Lys/Ile/Val-Pro-Xaa-Asn-Ala/Gly-Lys-Ile/Val-Ser/Ala/Gly-Lys-Asp/Glu-Ala/Ser-Lys-Glu/Asp/Gln-Thr/Ile-Xaa-Gln-Glu-Cys-Val/Ala-Ser/Thr-Glu-Phe-Ile-Ser-Phe-Ile/Val/His-Thr/Ser-[Pro]-Gly/Ser/Cys-Glu-Ala/Leu-Ser/Ala-Asp/Glu/Gly-Lys/Glu-Cys-Gln/His-Arg/Lys-Glu-Lys/Asn-Arg-Lys-Thr-Ile/Val-Asn-Gly-Asp/Glu-Asp-Leu/Ile-Xaa-Trp/Phe-Ala-Met/Ile/Leu-Xaa-Thr/Asn-Leu-Gly-Phe/Leu-Glu/Asp-Xaa-Tyr-(Xaa)₂-Pro/Gln/Ala-Leu/Val-Lys/Gly;

where Xaa can be any amino acid. The proline residue that appears in brackets is an additional residue that was found in only one sequence (not shown in FIG. 10B). The boldfaced residues that appear here and in the consensus sequences of FIGS. 10B-10C in their present positions are uniquely found in the non-LEC1-like clade, and may be used to identify members of this clade. The G482-like subclade may be delineated by the underlined serine residue in its present position here and in the consensus sequence of FIGS. 10B-10C. More generally, the non-LEC1-like clade is distinguished by a B domain comprising: Asn-(Xaa)₄-Lys-(Xaa)₃₃₋₃₄-Asn-Gly;

and the G482 subclade is distinguished by a B-domain comprising: Ser-(Xaa)₉-Asn-(Xaa)₄-Lys-(Xaa)₃₃₋₃₄-Asn-Gly.
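The two generalized B-domain motifs above translate directly into one-letter regular expressions, which may be a convenient way to screen candidate sequences. The sketch below is illustrative only: the function and constant names are hypothetical, the two test sequences are the G481 and LEC1 B domains as transcribed from Table 1, and a match to these short motifs is a much weaker criterion than the full consensus sequence given earlier.

```python
import re

# Xaa (any amino acid) is written as "." in the patterns below.
# Asn-(Xaa)4-Lys-(Xaa)33-34-Asn-Gly
NON_LEC1_MOTIF = re.compile(r"N.{4}K.{33,34}NG")
# Ser-(Xaa)9-Asn-(Xaa)4-Lys-(Xaa)33-34-Asn-Gly
G482_SUBCLADE_MOTIF = re.compile(r"S.{9}N.{4}K.{33,34}NG")

def classify_b_domain(b_domain):
    """Classify a B domain (one-letter amino acid code) by the two motifs."""
    seq = b_domain.upper()
    if G482_SUBCLADE_MOTIF.search(seq):
        return "G482 subclade"
    if NON_LEC1_MOTIF.search(seq):
        return "non-LEC1-like clade"
    return "no match"

# B domains transcribed from Table 1 (G481, SEQ ID NO: 88; LEC1, AAC39488).
G481_B = ("REQDRYLPIANISRIMKKALPPNGKI" "GKDAKDTVQECVSEFISFITSEASDK"
          "CQKEKRKTVNGDDLLWAMATLGFEDY" "LEPLKIYLARYRE")
LEC1_B = ("REQDQYMPIANVIRIMRKTLPSHAKI" "SDDAKETIQECVSEYISFVTGEANER"
          "CQREQRKTITAEDILWAMSKLGFDNY" "VDPLTVFINRYRE")

print(classify_b_domain(G481_B))  # G482 subclade
print(classify_b_domain(LEC1_B))  # no match
```

Because the G482 subclade motif contains the non-LEC1-like motif, the more specific pattern is tested first.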

Overexpression of these polypeptides confers increased abiotic stress tolerance in a transgenic plant, as compared to a non-transformed plant that does not overexpress the polypeptide.

The CCAAT Family Members Under Study

G481, G482 and G485 (polynucleotide SEQ ID NOs: 87, 89 and 2009) were chosen for study based on observations that Arabidopsis plants overexpressing these genes had resistance to abiotic stresses, such as osmotic stress, and including drought-related stress (see Example XIII, below). G481, G482 and G485 are members of the CCAAT family, proteins that act in a multi-subunit complex and are believed to bind CCAAT boxes in promoters of target genes as trimers or tetramers.

In Arabidopsis, three types of CCAAT binding proteins exist: HAP2, HAP3 and HAP5. The G481, G482 and G485 polypeptides, as well as a number of other proteins in the Arabidopsis proteome, belong to the HAP3 class. As reported in the scientific literature thus far, only two genes from the HAP3 class have been functionally analyzed to a substantial degree. These are LEAFY COTYLEDON1 (LEC1) and its most closely related subunit, LEC1-LIKE (L1L). LEC1 and L1L are expressed primarily during seed development. Both appear to be essential for embryo survival of desiccation during seed maturation (Kwong et al. (2003) Plant Cell 15: 5-18). LEC1 is a critical regulator required for normal development during the early and late phases of embryogenesis that is sufficient to induce embryonic development in vegetative cells. Kwong et al. showed that ten Arabidopsis HAP3 subunits can be divided into two classes based on sequence identity in their central, conserved B domain. LEC1 and L1L constitute LEC1-type HAP3 subunits, whereas the remaining HAP3 subunits were designated non-LEC1-type.

Phylogenetic trees based on sequence relatedness of the HAP3 genes are shown in FIGS. 6 and 7. As can be seen in these figures showing the L1L-related CCAAT transcription factor family, G1364 and G2345 are closely related to G481, and G482 and G485 are more closely related to G481 than is either LEC1 or L1L, the latter two sequences being found on somewhat more distant nodes. The present invention encompasses the G482 subclade of the non-LEC1-like clade of proteins of the L1L-related CCAAT transcription factor family, for which a representative number of members from monocot and dicot species (for example, Arabidopsis sequences G481, G482, G485, G1364 and G2345, soy sequences G3472, G3475, G3476, rice sequences G3395, G3397, G3398, and corn sequences G3434 and G3436) have been shown to confer improved abiotic stress tolerance in plants when overexpressed.

Table 1 shows the polypeptides identified by SEQ ID NO; Gene ID (GID) No.; the transcription factor family to which the polypeptide belongs, and conserved B domains of the polypeptide. The first column shows the polypeptide SEQ ID NO; the second column the species (abbreviated) and identifier (GID or “Gene IDentifier”); the third column shows the B domain; the fourth column shows the amino acid coordinates of the conserved domain which were used to determine percentage identity to G481; and the fifth column shows the percentage identity to G481. The sequences are arranged in descending order of percentage identity to G481.

For Tables 1 and 2, homology was determined after aligning the sequences using the methods of Smith and Waterman (1981) Adv. Appl. Math. 2: 482-489. After alignment, sequence comparisons between the polypeptides were performed by comparison over a comparison window to identify and compare local regions of sequence similarity. A description of the method is provided in Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons (1998, and supplements through 2001); Altschul et al. (1990) J. Mol. Biol. 215: 403-410; and Gish and States (1993) Nature Genetics 3: 266-72. The percentage identity reported in these tables is based on the comparison within these windows.
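Once two sequences have been aligned, the percentage identity within a comparison window reduces to counting identical positions. The following is a minimal sketch under stated assumptions: the two inputs are already aligned and of equal length, gaps are written as "-", and the denominator is the full window length (conventions differ on how gap columns are counted). It does not perform the Smith and Waterman alignment itself, and the short test strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue.

    Both inputs must already be aligned to the same length; "-" marks a gap.
    Gap columns count toward the window length but never as matches
    (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical aligned windows, for illustration only.
print(percent_identity("REQDRYLPIA", "REQDRFLPIA"))  # 90.0
```

For the equal-length, ungapped B domains of Table 1, the two sequences can be passed to this function directly.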

TABLE 1 Gene families and B domains

Polypeptide SEQ ID NO: | Species/GID No., Accession No., or Identifier | B Domain | Amino Acid Coordinates for Percent Identity Determination | % ID to CCAAT-box binding conserved domain of G481 (SEQ ID NO: 88)
88 | At/G481 | REQDRYLPIANISRIMKKALPPNGKIGKDAKDTVQECVSEFISFITSEASDKCQKEKRKTVNGDDLLWAMATLGFEDYLEPLKIYLARYRE | 20-110 | 100%
806 | Zm/G3434 | REQDRFLPIANISRIMKKAVPANGKIAKDAKETLQECVSEFISFVTSEASDKCQKEKRKTINGDDLLWAMATLGFEEYVEPLKIYLQKYKE | 18-108 | 85%
2102 | At/G1364 | REQDRFLPIANISRIMKRGLPANGKIAKDAKEIVQECVSEFISFVTSEASDKCQREKRKTINGDDLLWAMATLGFEDYMEPLKVYLMRYRE | 29-119 | 85%
800 | Gm/G3475 | REQDRFLPIANVSRIMKKALPANAKISKDAKETVQECVSEFISFITGEASDKCQREKRKTINGDDLLWAMTTLGFEDYVEPLKGYLQRFRE | 23-113 | 84%
2010 | At/G485 | REQDRFLPIANVSRIMKKALPANAKISKDAKETVQECVSEFISFITGEASDKCQREKRKTINGDDLLWAMTTLGFEDYVEPLKVYLQKYRE | 20-110 | 84%
798 | Gm/G3476 | REQDRFLPIANVSRIMKKALPANAKISKDAKETVQECYSEFISFITGEASDKCQREKRKTINGDDLLWAMTTLGFEEYVEPLKIYLQRFRE | 26-116 | 84%
2172 | At/G2345 | REQDRFLPIANISRIMKRGLPLNGKIAKDAKETMQECVSEFISFVTSEASDKCQREKRFTINGDDLLWAMATLGFEDYIDPLKVYLMRYRE | 28-118 | 84%
90 | At/G482 | REQDRFLPIANVSRIMKKALPANAKISKDAKETMQECVSEFISFVTGEASDKCQKEKRKTINGDDLLWAMTTLGFEDYVEPLKVYLQRFRE | 26-116 | 83%
801 | Gm/G3472 | REQDRFLPIANVSRIMKKALPANAKISKEAKETVQECVSEFISFITGEASDKCQKEKRKTINGDDLLWAMTTLGFEEYVEPLKVYLHKYRE | 25-115 | 83%
805 | Zm/G3436 | REQDRFLPIANVSRIMKKALPANAKISKDAKETVQECVSEFISFITGEASDKCQREKRKTINGDDLLWAMTTLGFEDYVEPLKLYLHKFRE | 20-110 | 83%
794 | Os/G3397 | REQDRFLPIANVSRIMKKALPANAKISKDAKETVQECVSEFISFITGEASDKCQREKRKTINGDDLLWAMTTLGFEDYVDPLKHYLHKFRE | 23-113 | 82%
790 | Os/G3395 | REQDRFLPIANISRIMKKAVPANGKIAKDAXETLQECVSEFISFVTSEASDKCQKEKRKTINGEDLLFAMGTLGFEEYVDPLKIYLHKYRE | 19-109 | 82%
796 | Os/G3398 | REQDRFLPIANVSRIMKRALPANAKISKDAKETVQECVSEFISFITGEASDKCQREKRKTINGDDLLWAMTTLGFEDYIDPLKLYLHKFRE | 21-111 | 81%
 | At/G1821 L1L NP_199578 | REQDRFMPIANVIRIMRRILPAHAKISDDSKETIQECVSEYISFITGEANERCQREQRKTITAEDVLWAMSKLGFDDYIEPLTLYLHRYRE | 28-118 | 62%
 | At/AAC39488 LEC1 | REQDQYMPIANVIRIMRKTLPSHAKISDDAKETIQECVSEYISFVTGEANERCQREQRKTITAEDILWAMSKLGFDNYVDPLTVFINRYRE | 28-118 | 62%

Abbreviations: At Arabidopsis thaliana; Gm Glycine max; Os Oryza sativa; Zm Zea mays

G682 and the MYB-Related Transcription Factors

G682 is a member of the MYB-related group of transcription factors. G682 and its related sequences were included in a program to test the ability of these sequences to confer the same drought-related abiotic stress tolerance previously observed by us in 35S::G682 lines.

We first identified G682 as a putative transcription factor from the Arabidopsis BAC, AF007269, based on sequence similarity to other members of the MYB-related family within the conserved domain. To date, no functional data are available for this gene in the literature. The gene corresponds to At4G01060, annotated by the Arabidopsis Genome Initiative. G682 is one of a 5-member clade of related proteins that range in size from 75 to 112 amino acids. These proteins contain a single MYB repeat, which is not uncommon for plant MYB transcription factors. Information on gene function has been published for two of the genes in this clade, CAPRICE (CPC/G225) and TRIPTYCHON (TRY/G1816). Published information on gene function is not available for two additional members of the clade, G226 and G2718. In our initial genomics program, members of the G682 clade were found to promote epidermal cell type alterations when overexpressed in Arabidopsis. These changes include both increased numbers of root hairs compared to wild type plants, as well as a reduction in trichome number. In addition, overexpression lines for all members of the clade showed a reduction in anthocyanin accumulation in response to stress, and enhanced tolerance to osmotic stress. In the case of 35S::G682 transgenic lines, an enhanced tolerance to high heat conditions was also observed. Given the phenotypic responses for G682 and its clade members, a sizeable number of clade members were included in the present drought-stress study. Table 2 summarizes the functional genomics data found in our investigation of G682 and its clade members.

TABLE 2
G682-clade experimental observations

Observation              GID columns: G226, G682, TRY (G1816), G2718
Reduction in Trichome #  X X X X
Increased Root Hair #    X X X X
N Tolerance              X X X
Heat Tolerance           X
Salt Tolerance           X
Sugar response           X

There are approximately 50 members of this family in Arabidopsis. The MYB-related DNA-binding domain contains approximately 50 amino acids with a series of highly conserved residues arranged with a characteristic spacing. The single-repeat MYB proteins do not contain a typical transcriptional activation domain, which suggests that they may function by interfering with the formation or activity of transcription factors or transcription factor complexes (Wada et al. (1997) Science 277: 1113-1116; Schellmann et al. (2002) EMBO J. 21: 5036-5046). In addition to the G682 clade, CIRCADIAN CLOCK ASSOCIATED1 (CCA1/G214) and LATE ELONGATED HYPOCOTYL (LHY/G680) are two well-characterized MYB-related proteins that contain single MYB repeats (Wang et al. (1997) Plant Cell 9: 491-507; Schaffer et al. (1998) Cell 93: 1219-1229).

The differences in the phenotypic responses of the G682-clade overexpression lines (Table 2), along with the differences in the CPC (G225) and TRY (G1816) mutant phenotypes (Schellmann et al. (2002) supra), suggest that each of the five Arabidopsis genes in the clade has distinct but overlapping functions in the plant. In the case of 35S::G682 transgenic lines, an enhanced tolerance to high heat conditions was observed. Heat can cause osmotic stress, and it is therefore consistent that these transgenic lines were also more tolerant to drought stress in a soil-based assay. Another common feature for four of five members of this clade is that they enhance performance under nitrogen-limiting conditions. 35S::G682 plants lacked this feature in the first round of screening, but given the high-throughput nature of the genomics program, it is possible this phenotype would have been observed if a greater number of lines had been examined.

All of the genes in the Arabidopsis G682 clade reduced trichomes and increased root hairs when constitutively overexpressed (Table 2). It is unknown, however, whether the drought-tolerance phenotypes in these lines are related to the increase in root hairs on the root epidermis. Increasing root hair density may increase the absorptive surface area and the number of nitrate transporters that are normally found there. Alternatively, the wer, ttg1 and gl2 mutations, all of which increase root hair frequency, have also been shown to cause ectopic stomate formation on the epidermis of hypocotyls. Thus, it is possible that CPC and TRY could be involved in the development, or regulation, of stomates as well (Hung et al. (1998) Plant Physiol. 117: 73-84; Berger et al. (1998) Curr. Biol. 8: 421-430; Lee and Schiefelbein (1999) Cell 99: 473-483). The CPC (G225) and TRY (G1816) proteins have not been reported to alter hypocotyl epidermal cell fate; however, ectopic expression may have resulted in an alteration in stomate production in this tissue. Since alterations in stomate production could alter plant water status, stomate production should be examined in transgenics expressing the genes of the G682 clade, particularly in lines that show increased drought tolerance.

Interestingly, our data also suggest that G1816 (TRY) overexpression lines had a glucose sugar sensing phenotype. Several sugar sensing mutants have turned out to be allelic to ABA and ethylene mutants. This potentially implicates G1816 in hormone signaling.

As noted below, overexpression of a number of Arabidopsis and non-Arabidopsis members of the G682 subclade of MYB-related transcription factors conferred increased abiotic stress tolerance in transgenic plants, as compared to non-transgenic plants of the same species (i.e., a non-transformed plant that did not overexpress these polypeptides).

Table 3 shows the polypeptides identified by SEQ ID NO; Gene ID (GID) No.; the transcription factor family to which the polypeptide belongs, and conserved MYB-related domains of the polypeptide. The first column shows the polypeptide SEQ ID NO; the second column the species and GID; the third column shows the MYB-related domain; the fourth column shows the amino acid coordinates of the conserved domain that were used to determine the percentage identity of that conserved domain to the MYB-related domain of G682; and the fifth column shows the percentage identity to G682. The sequences are arranged in descending order of percentage identity to G682.

TABLE 3
Gene families and MYB-related domains

Columns, in order: Polypeptide SEQ ID NO; Species/GID No., Accession No., or Identifier; MYB-related Domain; Amino Acid Coordinates for Percent Identity Determination; % ID to MYB-related conserved domain of G682

148   At/G682    EWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIPGRT  27-63  100%
559   Os/G3393   AQNFVHFTEEEEDLVFRMHRLVGNRWELIAGRIPGRT  33-69  78%
2192  At/G2718   EWEEIAMAQEEEDLICRMYKLVGERWDLIAGRIPGRT  28-64  75%
1082  Os/G3392   AQNFVHFTEEEEDIVFRMHRLVGNRWELIAGRIPGRT  28-64  75%
1089  Zm/G3431   AQHLVDFTEAEEDLVSRMHRLVGNRWEIIAGRIPGRT  28-64  73%
1084  Gm/G3450   EWKVIHMSEQEEDLIRRMYKLVGDKWNLIAGRIPGRK  16-52  72%
2142  At/G1816   EWEFINMTEQEEDLIFRMYRLVGDRWDLIAGRVPGRQ  26-62  69%
1088  Gm/G3449   GSSKVEFSEDEETLIIRMYKLVGERWSLIAGRIPGRT  22-58  69%
1087  Gm/G3448   GSSKVEFSEDEETLIIRMYKLVGERWSIIAGRIPGRT  22-58  66%
38    At/G226    EWEFISMTEQEEDLISRMYRLVGNRWDLIAGRVVGRK  34-70  63%
1086  Gm/G3446   QVSDVEFSEAEEILIAMVYNLVGERWSLIAGRIPGRT  22-58  60%
1083  Gm/G3445   QVSDVEFSEAEEILIAMVYNLVGERWSLIAGRIPGRT  21-57  60%

Abbreviations: At, Arabidopsis thaliana; Gm, Glycine max; Os, Oryza sativa; Zm, Zea mays
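The percentage identity values in Table 3 compare each conserved domain with the MYB-related domain of G682. As a minimal sketch of that kind of comparison, the Python function below counts identical residues position by position between two domains of equal length. It assumes an ungapped comparison, and the function name is illustrative only; the values in Table 3 were presumably produced with alignment software and need not match this simple count.

```python
def ungapped_percent_identity(domain_a: str, domain_b: str) -> float:
    """Percent of positions carrying identical residues in two equal-length domains."""
    if len(domain_a) != len(domain_b):
        raise ValueError("domains must have the same length for an ungapped comparison")
    if not domain_a:
        raise ValueError("domains must not be empty")
    matches = sum(a == b for a, b in zip(domain_a, domain_b))
    return 100.0 * matches / len(domain_a)


# MYB-related domains copied from Table 3 (37 residues each).
G682 = "EWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIPGRT"
G2718 = "EWEEIAMAQEEEDLICRMYKLVGERWDLIAGRIPGRT"

print(f"G682 vs G682:  {ungapped_percent_identity(G682, G682):.1f}%")
print(f"G682 vs G2718: {ungapped_percent_identity(G682, G2718):.1f}%")
```

A gapped alignment, or a different choice of denominator, would generally give somewhat different percentages for the same pair of sequences.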

G682 and its paralogs and orthologs are composed (almost entirely) of a single MYB-repeat DNA binding domain that is highly conserved across plant species. An alignment of the G682-like proteins from Arabidopsis, soybean, rice and corn that are being analyzed in this trait module is shown in FIGS. 3A and 3B. No function has been assigned to any of these MYB-related genes outside of Arabidopsis.

Because the G682 clade members are short proteins that are composed almost exclusively of a DNA binding motif, it is likely that they function as repressors. This is consistent with expression analyses indicating that CPC represses its own transcription as well as that of WER and GL2 (Wada et al. (2002) supra; Lee and Schiefelbein (2002) supra). Repression may occur at the level of DNA binding through competition with other factors at target promoters, although repression via protein-protein interactions cannot be excluded.

The AP2 Family, Including the G47/G2133 and G1792 Clades.

AP2 (APETALA2) and EREBPs (Ethylene-Responsive Element Binding Proteins) are the prototypic members of a family of transcription factors unique to plants, whose distinguishing characteristic is that they contain the so-called AP2 DNA-binding domain (for a review, see Riechmann and Meyerowitz (1998) Biol. Chem. 379: 633-646). The AP2 domain was first recognized as a repeated motif within the Arabidopsis thaliana AP2 protein (Jofuku et al. (1994) Plant Cell 6: 1211-1225). Shortly afterwards, four DNA-binding proteins from tobacco were identified that interact with a sequence that is essential for the responsiveness of some promoters to the plant hormone ethylene, and were designated as ethylene-responsive element binding proteins (EREBPs; Ohme-Takagi et al. (1995) Plant Cell 7: 173-182). The DNA-binding domain of EREBP-2 was mapped to a region that was common to all four proteins (Ohme-Takagi et al. (1995) supra), and that was found to be closely related to the AP2 domain (Weigel (1995) Plant Cell 7: 388-389) but that did not bear sequence similarity to previously known DNA-binding motifs.

AP2/EREBP genes form a large family, with many members known in several plant species (Okamuro et al. (1997) Proc. Natl. Acad. Sci. USA 94: 7076-7081; Riechmann and Meyerowitz (1998) supra). The number of AP2/EREBP genes in the Arabidopsis thaliana genome is approximately 145 (Riechmann et al. (2000) Science 290: 2105-2110). The APETALA2 class is characterized by the presence of two AP2 DNA binding domains, and contains 14 genes. The AP2/ERF class is the largest subfamily, and includes 125 genes that are involved in abiotic (DREB subgroup) and biotic (ERF subgroup) stress responses. The RAV subgroup includes 6 genes, all of which have a B3 DNA binding domain in addition to the AP2 DNA binding domain (Kagaya et al. (1999) Nucleic Acids Res. 27: 470-478).

Arabidopsis AP2 is involved in the specification of sepal and petal identity through its activity as a homeotic gene that forms part of the combinatorial genetic mechanism of floral organ identity determination, and it is also required for normal ovule and seed development (Bowman et al. (1991) Development 112: 1-20; Jofuku et al. (1994) supra). Arabidopsis ANT is required for ovule development and it also plays a role in floral organ growth (Elliott et al. (1996) Plant Cell 8: 155-168; Klucher et al. (1996) Plant Cell 8: 137-153). Finally, maize Glossy15 regulates leaf epidermal cell identity (Moose et al. (1996) Genes Dev. 10: 3018-3027).

The attack of a plant by a pathogen may induce defense responses that lead to resistance to the invasion, and these responses are associated with transcriptional activation of defense-related genes, among them those encoding pathogenesis-related (PR) proteins. The involvement of EREBP-like genes in controlling the plant defense response is based on the observation that many PR gene promoters contain a short cis-acting element that mediates their responsiveness to ethylene (ethylene appears to be one of several signal molecules controlling the activation of defense responses). Tobacco EREBP-1, -2, -3, and -4, and tomato Pti4, Pti5 and Pti6 proteins have been shown to recognize such cis-acting elements (Ohme-Takagi et al. (1995) supra; Zhou et al. (1997) EMBO J. 16: 3207-3218). In addition, Pti4, Pti5, and Pti6 proteins have been shown to directly interact with Pto, a protein kinase that confers resistance against Pseudomonas syringae pv tomato (Zhou et al. (1997) supra). Plants are also challenged by adverse environmental conditions like cold or drought, and EREBP-like proteins appear to be involved in the responses to these abiotic stresses as well. COR (for cold-regulated) gene expression is induced during cold acclimation, the process by which plants increase their resistance to freezing in response to low nonfreezing temperatures. The Arabidopsis EREBP-like gene CBF1 (Stockinger et al. (1997) Proc. Natl. Acad. Sci. USA 94: 1035-1040) is a regulator of the cold acclimation response, because ectopic expression of CBF1 in Arabidopsis transgenic plants induced COR gene expression in the absence of a cold stimulus, and the plant freezing tolerance was increased (Jaglo-Ottosen et al. (1998) Science 280: 104-106). Finally, another Arabidopsis EREBP-like gene, ABI4, is involved in abscisic acid (ABA) signal transduction, because abi4 mutants are insensitive to ABA (ABA is a plant hormone that regulates many agronomically important aspects of plant development; Finkelstein et al. (1998) Plant Cell 10: 1043-1054).

The SCR Family, Including the G922 Clade.

The SCARECROW gene, which regulates an asymmetric cell division essential for proper radial organization of root cell layers, was isolated from Arabidopsis thaliana by screening a genomic library with sequences flanking a T-DNA insertion causing a “scarecrow” mutation (Di Laurenzio et al. (1996) Cell 86, 423-433). The gene product was tentatively described as a transcription factor based on the presence of homopolymeric stretches of several amino acids, the presence of a basic domain similar to that of the basic-leucine zipper family of transcription factors, and the presence of leucine heptad repeats. The presence of several Arabidopsis ESTs with gene products homologous to the SCARECROW gene was noted. The ability of the SCARECROW gene to complement the scarecrow mutation was also demonstrated (Malamy et al. (1997) Plant J. 12, 957-963).

More recently, the SCARECROW homologue RGA, which encodes a negative regulator of the gibberellin signal transduction pathway, was isolated from Arabidopsis by genomic subtraction (Silverstone et al. (1998) Plant Cell 10, 155-169). The RGA gene was shown to be expressed in many different tissues and the RGA protein was shown to be localized to the nucleus. The same gene was isolated by Truong (Truong et al. (1997) FEBS Lett. 410: 213-218) by identifying cDNA clones which complement a yeast nitrogen metabolism mutant, suggesting that RGA may be involved in regulating diverse metabolic processes. Another SCARECROW homologue designated GAI, which also is involved in gibberellin signaling processes, has been isolated by Peng (Peng et al. (1997) Genes Dev. 11, 3194-3205). Interestingly, GAI is the gene that initiated the Green Revolution. Peng et al. (Peng et al. (1999) Nature 400, 256-261) have recently shown that maize GAI orthologs, when mutated, result in plants that are shorter, have increased seed yield, and are more resistant to damage by rain and wind than wild-type plants. Based on the inclusion of the GAI, RGA and SCR genes in this family, it has also been referred to as the GRAS family (Pysh et al. (1999) Plant J. 18, 111-119).

The scarecrow gene family has 32 members in the Arabidopsis genome.

The NAC Family, Including the G2053 Clade.

The NAC family is a group of transcription factors that share a highly conserved N-terminal domain of about 150 amino acids, designated the NAC domain (NAC stands for Petunia NAM and Arabidopsis ATAF1, ATAF2 and CUC2). This is believed to be a novel domain that is present in both monocot and dicot plants but is absent from yeast and animal proteins. One hundred and twelve members of the NAC family have been identified in the Arabidopsis genome. The NAC class of proteins can be divided into at least two sub-families on the basis of amino acid sequence similarities within the NAC domain. One sub-family is built around the NAM and CUC2 (cup-shaped cotyledon) proteins whilst the other sub-family contains factors with a NAC domain similar to those of ATAF1 and ATAF2.

Thus far, little is known about the function of different NAC family members. This is surprising given that there are 113 members in Arabidopsis. However, NAM, CUC1 and CUC2 are thought to have vital roles in the regulation of embryo and flower development. In Petunia, nam mutant embryos fail to develop a shoot apical meristem (SAM) and have fused cotyledons. These mutants sometimes generate escape shoots that produce defective flowers with extra petals and fused organs. In Arabidopsis, the cuc1 and cuc2 mutations have somewhat similar effects, causing defects in SAM formation and the separation of cotyledons, sepals and stamens.

Although nam and cuc mutants exhibit comparable defects during embryogenesis, the penetrance of these phenotypes is much lower in cuc mutants. Functional redundancy of the CUC genes in Arabidopsis may explain this observation. In terms of the flower phenotype there are notable differences between nam and cuc mutants. Flowers of cuc mutants do not contain additional organs and the formation of sepals and stamens is most strongly affected. In nam mutants, by contrast, the flowers do carry additional organs and petal formation is more markedly affected than that of other floral organs. These apparent differences might be explained in two ways. The NAM and CUC proteins may have been recruited into different roles in the development of Arabidopsis and Petunia flowers. Alternatively, the proteins could share a common function between the two species, with the different mutant floral phenotypes arising from variations in the way other genes (that participate in the same developmental processes) are affected by defects in NAM or CUC.

A further gene from this family, NAP (NAC-like activated by AP3/PI), is also involved in flower development and is thought to influence the transition between cell division and cell expansion in stamens and petals. Overall, then, the NAC proteins mainly appear to regulate developmental processes.

Producing Polypeptides

The polynucleotides of the invention include sequences that encode transcription factors and transcription factor homolog polypeptides and sequences complementary thereto, as well as unique fragments of coding sequence, or sequence complementary thereto. Such polynucleotides can be, e.g., DNA or RNA, e.g., mRNA, cRNA, synthetic RNA, genomic DNA, cDNA, synthetic DNA, oligonucleotides, etc. The polynucleotides are either double-stranded or single-stranded, and include either or both sense (i.e., coding) sequences and antisense (i.e., non-coding, complementary) sequences. The polynucleotides include the coding sequence of a transcription factor, or transcription factor homolog polypeptide, in isolation, in combination with additional coding sequences (e.g., a purification tag, a localization signal, as a fusion-protein, as a pre-protein, or the like), in combination with non-coding sequences (e.g., introns or inteins, regulatory elements such as promoters, enhancers, terminators, and the like), and/or in a vector or host environment in which the polynucleotide encoding a transcription factor or transcription factor homolog polypeptide is an endogenous or exogenous gene.

A variety of methods exist for producing the polynucleotides of the invention. Procedures for identifying and isolating DNA clones are well known to those of skill in the art, and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, vol. 152, Academic Press, Inc., San Diego, Calif. (“Berger”); Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; and Current Protocols in Molecular Biology, Ausubel et al. eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2000) (“Ausubel”).

Alternatively, polynucleotides of the invention can be produced by a variety of in vitro amplification methods adapted to the present invention by appropriate selection of specific or degenerate primers. Examples of protocols sufficient to direct persons of skill through in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qbeta-replicase amplification and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the production of the homologous nucleic acids of the invention, are found in Berger (supra), Sambrook (supra), and Ausubel (supra), as well as Mullis et al. (1987) PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif. (1990) (Innis). Improved methods for cloning in vitro amplified nucleic acids are described in Wallace et al. U.S. Pat. No. 5,426,039. Improved methods for amplifying large nucleic acids by PCR are summarized in Cheng et al. (1994) Nature 369: 684-685 and the references cited therein, in which PCR amplicons of up to 40 kb are generated. One of skill will appreciate that essentially any RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., Ausubel, Sambrook and Berger, all supra.
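To illustrate what selecting a degenerate primer involves, the following Python sketch expands a degenerate primer, written with the standard IUPAC nucleotide ambiguity codes, into every specific sequence it represents. The primer shown is a hypothetical example chosen only because its codons encode Glu-Glu-Asp; it is not taken from the Sequence Listing, and the function name is illustrative.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate_primer(primer: str) -> List[str]:
    """Return every specific sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer: GAR covers both Glu codons (GAA, GAG) and
# GAY covers both Asp codons (GAT, GAC).
primer = "GARGARGAY"
variants = expand_degenerate_primer(primer)
print(f"{primer} represents {len(variants)} specific sequences")
for variant in variants:
    print(variant)
```

The number of specific sequences grows multiplicatively with each ambiguous position, which is one reason degenerate primers are usually designed against short, highly conserved stretches of a protein.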

Alternatively, polynucleotides and oligonucleotides of the invention can be assembled from fragments produced by solid-phase synthesis methods. Typically, fragments of up to approximately 100 bases are individually synthesized and then enzymatically or chemically ligated to produce a desired sequence, e.g., a polynucleotide encoding all or part of a transcription factor. For example, chemical synthesis using the phosphoramidite method is described, e.g., by Beaucage et al. (1981) Tetrahedron Letters 22: 1859-1869; and Matthes et al. (1984) EMBO J. 3: 801-805. According to such methods, oligonucleotides are synthesized, purified, annealed to their complementary strand, ligated and then optionally cloned into suitable vectors. And if so desired, the polynucleotides and polypeptides of the invention can be custom ordered from any of a number of commercial suppliers.
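The bookkeeping behind fragment-based assembly can be sketched in a few lines of Python. The example below splits a target sequence into consecutive fragments of at most 100 bases and derives the complementary strand for each fragment. It is a simplified illustration under stated assumptions: the target sequence is made up, the helper names are hypothetical, and practical assembly designs usually stagger the two strands so that neighboring fragments overlap, which this sketch does not attempt.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_into_fragments(seq: str, max_len: int = 100) -> List[str]:
    """Split a sequence into consecutive fragments of at most max_len bases."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [seq[i:i + max_len] for i in range(0, len(seq), max_len)]


# Made-up 250-base target used only to exercise the functions.
target = "ATGGCTTCAG" * 25
fragments = split_into_fragments(target, max_len=100)
for number, fragment in enumerate(fragments, start=1):
    partner = reverse_complement(fragment)
    print(f"fragment {number}: {len(fragment)} bases, partner starts {partner[:10]}")
```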

Homologous Sequences

Sequences homologous, i.e., that share significant sequence identity or similarity, to those provided in the Sequence Listing, derived from Arabidopsis thaliana or from other plants of choice, are also an aspect of the invention. Homologous sequences can be derived from any plant including monocots and dicots and in particular agriculturally important plant species, including but not limited to, crops such as soybean, wheat, corn (maize), potato, cotton, rice, rape, oilseed rape (including canola), sunflower, alfalfa, clover, sugarcane, and turf; or fruits and vegetables, such as banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, tobacco, tomato, tomatillo, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels sprouts, and kohlrabi). Other crops, including fruits and vegetables, whose phenotype can be changed and which comprise homologous sequences include barley; rye; millet; sorghum; currant; avocado; citrus fruits such as oranges, lemons, grapefruit and tangerines; artichoke; cherries; nuts such as the walnut and peanut; endive; leek; roots such as arrowroot, beet, cassava, turnip, radish, yam, and sweet potato; and beans. The homologous sequences may also be derived from woody species, such as pine, poplar and eucalyptus, or mint or other labiates. In addition, homologous sequences may be derived from plants that are evolutionarily related to crop plants, but which may not have yet been used as crop plants. Examples include deadly nightshade (Atropa belladonna), related to tomato; jimson weed (Datura stramonium), related to peyote; and teosinte (Zea species), related to corn (maize).

Orthologs and Paralogs

Homologous sequences as described above can comprise orthologous or paralogous sequences. Several different methods are known by those of skill in the art for identifying and defining these functionally homologous sequences. Three general methods for defining orthologs and paralogs are described; an ortholog or paralog, including equivalogs, may be identified by one or more of the methods described below.

Orthologs and paralogs are evolutionarily related genes that have similar sequence and similar functions. Orthologs are structurally related genes in different species that are derived by a speciation event. Paralogs are structurally related genes within a single species that are derived by a duplication event.

Within a single plant species, gene duplication may cause two copies of a particular gene, giving rise to two or more genes with similar sequence and often similar function known as paralogs. A paralog is therefore a similar gene formed by duplication within the same species. Paralogs typically cluster together or in the same clade (a group of similar genes) when a gene family phylogeny is analyzed using programs such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 22: 4673-4680; Higgins et al. (1996) Methods Enzymol. 266: 383-402). Groups of similar genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle (1987) J. Mol. Evol. 25: 351-360). For example, a clade of very similar MADS domain transcription factors from Arabidopsis all share a common function in flowering time (Ratcliffe et al. (2001) Plant Physiol. 126: 122-132), and a group of very similar AP2 domain transcription factors from Arabidopsis are involved in tolerance of plants to freezing (Gilmour et al. (1998) Plant J. 16: 433-442). Analysis of groups of similar genes with similar function that fall within one clade can yield sub-sequences that are particular to the clade. These sub-sequences, known as consensus sequences, can not only be used to define the sequences within each clade, but also to define the functions of these genes; genes within a clade may contain paralogous sequences, or orthologous sequences that share the same function (see also, for example, Mount (2001), in Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., page 543).
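One simple way to picture a clade-specific consensus sequence is to keep only the positions that are invariant across aligned clade members. The Python sketch below does this for the four Arabidopsis MYB-related domains listed in Table 3. It assumes the sequences are already aligned without gaps, and the strict all-members-identical rule and the function name are illustrative choices; published consensus definitions often tolerate conservative substitutions or use frequency thresholds.

```python
from typing import List


def invariant_consensus(aligned: List[str], wildcard: str = "x") -> str:
    """Keep residues shared by every aligned sequence; mark other positions."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else wildcard
        for column in zip(*aligned)
    )


# Arabidopsis MYB-related domains copied from Table 3.
clade_domains = [
    "EWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIPGRT",  # G682
    "EWEEIAMAQEEEDLICRMYKLVGERWDLIAGRIPGRT",  # G2718
    "EWEFINMTEQEEDLIFRMYRLVGDRWDLIAGRVPGRQ",  # G1816
    "EWEFISMTEQEEDLISRMYRLVGNRWDLIAGRVVGRK",  # G226
]

print(invariant_consensus(clade_domains))
```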

Speciation, the production of new species from a parental species, can also give rise to two or more genes with similar sequence and similar function. These genes, termed orthologs, often have an identical function within their host plants and are often interchangeable between species without losing function. Because plants have common ancestors, many genes in any plant species will have a corresponding orthologous gene in another plant species. Once a phylogenetic tree for a gene family of one species has been constructed using a program such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 22: 4673-4680; Higgins et al. (1996) supra), potential orthologous sequences can be placed into the phylogenetic tree and their relationship to genes from the species of interest can be determined. Orthologous sequences can also be identified by a reciprocal BLAST strategy. Once an orthologous sequence has been identified, the function of the ortholog can be deduced from the identified function of the reference sequence.
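The reciprocal BLAST strategy can be summarized as follows: a gene from species A and a gene from species B are treated as candidate orthologs when each is the other's top-scoring hit. The Python sketch below applies that rule to two score tables. The gene names and scores are placeholders invented for illustration, and the input format (a mapping from query to subject scores) is an assumption; in practice the scores would be parsed from actual BLAST output.

```python
from typing import Dict, List, Optional, Tuple

# query ID -> {subject ID: similarity score}
Scores = Dict[str, Dict[str, float]]


def best_hit(scores: Scores, query: str) -> Optional[str]:
    """Return the highest-scoring subject for a query, or None if it has no hits."""
    hits = scores.get(query, {})
    return max(hits, key=hits.get) if hits else None


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    pairs = []
    for gene_a in a_vs_b:
        gene_b = best_hit(a_vs_b, gene_a)
        if gene_b is not None and best_hit(b_vs_a, gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return pairs


# Placeholder identifiers and scores, for illustration only.
species_a_vs_b = {
    "A_gene1": {"B_gene1": 250.0, "B_gene2": 90.0},
    "A_gene2": {"B_gene1": 120.0, "B_gene2": 80.0},
}
species_b_vs_a = {
    "B_gene1": {"A_gene1": 245.0, "A_gene2": 118.0},
    "B_gene2": {"A_gene1": 88.0, "A_gene2": 85.0},
}

print(reciprocal_best_hits(species_a_vs_b, species_b_vs_a))
```

In this toy data only A_gene1 and B_gene1 select each other, so A_gene2 is left without a candidate ortholog even though it has hits in species B.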

Transcription factor gene sequences are conserved across diverse eukaryotic species lines (Goodrich et al. (1993) Cell 75: 519-530; Lin et al. (1991) Nature 353: 569-571; Sadowski et al. (1988) Nature 335: 563-564). Plants are no exception to this observation; diverse plant species possess transcription factors that have similar sequences and functions.

Orthologous genes from different organisms have highly conserved functions, and very often essentially identical functions (Lee et al. (2002) Genome Res. 12: 493-502; Remm et al. (2001) J. Mol. Biol. 314: 1041-1052). Paralogous genes, which have diverged through gene duplication, may retain similar functions of the encoded proteins. In such cases, paralogs can be used interchangeably with respect to certain embodiments of the instant invention (for example, transgenic expression of a coding sequence). An example of such highly related paralogs is the CBF family, with three well-defined members in Arabidopsis and at least one ortholog in Brassica napus (SEQ ID NOs: 1956, 1958, 1960, or 2204, respectively), all of which control pathways involved in both freezing and drought stress (Gilmour et al. (1998) Plant J. 16: 433-442; Jaglo et al. (2001) Plant Physiol. 127: 910-917).

The following references represent a small sampling of the many studies that demonstrate that conserved transcription factor genes from diverse species are likely to function similarly (i.e., regulate similar target sequences and control the same traits), and that transcription factors may be transformed into diverse species to confer or improve traits.

(1) The Arabidopsis NPR1 gene regulates systemic acquired resistance (SAR); over-expression of NPR1 leads to enhanced resistance in Arabidopsis. When either Arabidopsis NPR1 or the rice NPR1 ortholog was overexpressed in rice (which, as a monocot, is diverse from Arabidopsis) and the plants were challenged with the rice bacterial blight pathogen Xanthomonas oryzae pv. oryzae, the transgenic plants displayed enhanced resistance (Chern et al. (2001) Plant J. 27: 101-113). NPR1 acts through activation of expression of transcription factor genes, such as TGA2 (Fan and Dong (2002) Plant Cell 14: 1377-1389).

(2) E2F genes are involved in transcription of plant genes for proliferating cell nuclear antigen (PCNA). Plant E2Fs share a high degree of similarity in amino acid sequence between monocots and dicots, and are even similar to the conserved domains of the animal E2Fs. Such conservation indicates a functional similarity between plant and animal E2Fs. E2F transcription factors that regulate meristem development act through common cis-elements, and regulate related (PCNA) genes (Kosugi and Ohashi, (2002) Plant J. 29: 45-59).

(3) The ABI5 gene (abscisic acid (ABA) insensitive 5) encodes a basic leucine zipper factor required for ABA response in the seed and vegetative tissues. Co-transformation experiments with ABI5 cDNA constructs in rice protoplasts resulted in specific transactivation of the ABA-inducible wheat, Arabidopsis, bean, and barley promoters. These results demonstrate that ABI5 transcription factors with similar sequences are key targets of a conserved ABA signaling pathway in diverse plants (Gampala et al. (2001) J. Biol. Chem. 277: 1689-1694).

(4) Sequences of three Arabidopsis GAMYB-like genes were obtained on the basis of sequence similarity to GAMYB genes from barley, rice, and L. temulentum. These three Arabidopsis genes were determined to encode transcription factors (AtMYB33, AtMYB65, and AtMYB101) and could substitute for a barley GAMYB and control alpha-amylase expression (Gocal et al. (2001) Plant Physiol. 127: 1682-1693).

(5) The floral control gene LEAFY from Arabidopsis can dramatically accelerate flowering in numerous dicotyledonous plants. Constitutive expression of Arabidopsis LEAFY also caused early flowering in transgenic rice (a monocot), with a heading date that was 26-34 days earlier than that of wild-type plants. These observations indicate that floral regulatory genes from Arabidopsis are useful tools for heading date improvement in cereal crops (He et al. (2000) Transgenic Res. 9: 223-227).

(6) Bioactive gibberellins (GAs) are essential endogenous regulators of plant growth. GA signaling tends to be conserved across the plant kingdom. GA signaling is mediated via GAI, a nuclear member of the GRAS family of plant transcription factors. Arabidopsis GAI has been shown to function in rice to inhibit gibberellin response pathways (Fu et al. (2001) Plant Cell 13: 1791-1802).

(7) The Arabidopsis gene SUPERMAN (SUP) encodes a putative transcription factor that maintains the boundary between stamens and carpels. By over-expressing Arabidopsis SUP in rice, the effect of the gene's presence on whorl boundaries was shown to be conserved. This demonstrated that SUP is a conserved regulator of floral whorl boundaries and affects cell proliferation (Nandi et al. (2000) Curr. Biol. 10: 215-218).

(8) Maize, petunia and Arabidopsis myb transcription factors that regulate flavonoid biosynthesis are very genetically similar and affect the same trait in their native species; therefore, the sequence and function of these myb transcription factors correlate with each other in these diverse species (Borevitz et al. (2000) Plant Cell 12: 2383-2394).

(9) Wheat reduced height-1 (Rht-B1/Rht-D1) and maize dwarf-8 (d8) genes are orthologs of the Arabidopsis gibberellin insensitive (GAI) gene. Both of these genes have been used to produce dwarf grain varieties that have improved grain yield. These genes encode proteins that resemble nuclear transcription factors and contain an SH2-like domain, indicating that phosphotyrosine may participate in gibberellin signaling. Transgenic rice plants containing a mutant GAI allele from Arabidopsis have been shown to produce reduced responses to gibberellin and are dwarfed, indicating that mutant GAI orthologs could be used to increase yield in a wide range of crop species (Peng et al. (1999) Nature 400: 256-261).

Transcription factors that are homologous to the listed sequences will typically share, in at least one conserved domain, at least about 70% amino acid sequence identity, and with regard to zinc finger transcription factors, at least about 50% amino acid sequence identity. More closely related transcription factors can share at least about 70%, or about 75% or about 80% or about 90% or about 95% or about 98% or more sequence identity with the listed sequences, or with the listed sequences but excluding or outside a known consensus sequence or consensus DNA-binding site, or with the listed sequences excluding one or all conserved domains. Factors that are most closely related to the listed sequences share, e.g., at least about 85%, about 90% or about 95% or more sequence identity to the listed sequences, or to the listed sequences but excluding or outside a known consensus sequence or consensus DNA-binding site or outside one or all conserved domains. At the nucleotide level, the sequences will typically share at least about 40% nucleotide sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably about 85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the listed sequences, or to a listed sequence but excluding or outside a known consensus sequence or consensus DNA-binding site, or outside one or all conserved domains. The degeneracy of the genetic code enables major variations in the nucleotide sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein. Conserved domains within a transcription factor family may exhibit a higher degree of sequence homology, such as at least 65% amino acid sequence identity including conservative substitutions, and preferably at least 80% sequence identity, and more preferably at least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at least about 90%, or at least about 95%, or at least about 98% sequence identity. Transcription factors that are homologous to the listed sequences should share at least 30%, or at least about 60%, or at least about 75%, or at least about 80%, or at least about 90%, or at least about 95% amino acid sequence identity over the entire length of the polypeptide or the homolog.

Percent identity can be determined electronically, e.g., by using the MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program can create alignments between two or more sequences according to different methods, for example, the clustal method. (See, for example, Higgins and Sharp (1988) Gene 73: 237-244.) The clustal algorithm groups sequences into clusters by examining the distances between all pairs. The clusters are aligned pairwise and then in groups. Other alignment algorithms or programs, including FASTA, BLAST, or ENTREZ, may also be used to calculate percent similarity. FASTA and BLAST are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with or without default settings. ENTREZ is available through the National Center for Biotechnology Information. In one embodiment, the percent identity of two sequences can be determined by the GCG program with a gap weight of 1, e.g., each amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch between the two sequences (see U.S. Pat. No. 6,262,333).

Other techniques for alignment are described in Doolittle, R. F. (1996) Methods in Enzymology: Computer Methods for Macromolecular Sequence Analysis, vol. 266, Academic Press, Orlando, Fla., USA. Preferably, an alignment program that permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments (see Shpaer (1997) Methods Mol. Biol. 70: 173-187). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. An alternative search strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel computer. This approach improves the ability to pick up distantly related matches, and is especially tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino acid sequences can be used to search both protein and DNA databases.
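As an illustration of a gap-permitting local alignment, the following minimal Python sketch computes a Smith-Waterman score for two sequences. The function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration only; they are not the parameters of MPSRCH, GAP, or any program named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    The scoring values are illustrative assumptions, not published defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Open or extend a gap in either sequence.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # A local alignment never scores below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # One gap is tolerated: ACGTT aligns with AC-TT.
    print(smith_waterman_score("ACGTT", "ACTT"))
```

Because cell values are floored at zero, unrelated sequences score zero rather than accumulating penalties, which is what makes the alignment local.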

The percentage similarity between two polypeptide sequences, e.g., sequence A and sequence B, is calculated by dividing the length of sequence A, minus the number of gap residues in sequence A, minus the number of gap residues in sequence B, into the sum of the residue matches between sequence A and sequence B, times one hundred. Gaps of low or of no similarity between the two amino acid sequences are not included in determining percentage similarity. Percent identity between polynucleotide sequences can also be counted or calculated by other methods known in the art, e.g., the Jotun Hein method. (See, e.g., Hein (1990) Methods Enzymol. 183: 626-645.) Identity between sequences can also be determined by other methods known in the art, e.g., by varying hybridization conditions (see US Patent Application No. 20010010913).
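The calculation described above can be sketched in Python as follows. This is one reading of the formula, under stated assumptions: the two sequences are supplied already aligned and of equal length, gaps are written as "-", the "length of sequence A" is taken as its aligned length, and a residue match is counted only when the two residues are identical. The function name is hypothetical.

```python
def percent_similarity(aligned_a, aligned_b, gap="-"):
    """Percent similarity of two pre-aligned sequences of equal length.

    Assumption: matches are identical residues; the denominator is the
    aligned length of A minus the gap positions in A and in B.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    gaps_a = aligned_a.count(gap)
    gaps_b = aligned_b.count(gap)
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    denominator = len(aligned_a) - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / denominator


if __name__ == "__main__":
    # Three of four aligned residues are identical.
    print(percent_similarity("MKTL", "MRTL"))
```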

The percent identity between two conserved domains of a transcription factor DNA-binding domain consensus polypeptide sequence can be as low as 16%, as exemplified in the case of GATA1 family of eukaryotic Cys₂/Cys₂-type zinc finger transcription factors. The DNA-binding domain consensus polypeptide sequence of the GATA1 family is CX₂CX₁₇CX₂C, where X is any amino acid residue. (See, for example, Takatsuji, supra.) Other examples of such conserved consensus polypeptide sequences with low overall percent sequence identity are well known to those of skill in the art.
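A consensus such as CX₂CX₁₇CX₂C can be located in a protein sequence by pattern matching even when overall identity is low. The sketch below uses Python's standard `re` module; the function name and the test sequence are hypothetical, and X is taken to be any upper-case one-letter residue code.

```python
import re

# C, any 2 residues, C, any 17 residues, C, any 2 residues, C.
GATA_FINGER = re.compile(r"C[A-Z]{2}C[A-Z]{17}C[A-Z]{2}C")


def find_gata_fingers(protein):
    """Return (start index, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in GATA_FINGER.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical sequence built only to contain one consensus match.
    demo = "MA" + "C" + "AA" + "C" + "G" * 17 + "C" + "AA" + "C" + "KK"
    print(find_gata_fingers(demo))
```

Note that `finditer` reports non-overlapping matches only; overlapping candidate sites would need a lookahead-based pattern.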

Thus, the invention provides methods for identifying a sequence similar or paralogous or orthologous or homologous to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by the polynucleotides, or otherwise noted herein and may include linking or associating a given plant phenotype or gene function with a sequence. In the methods, a sequence database is provided (locally or across an internet or intranet) and a query is made against the sequence database using the relevant sequences herein and associated plant phenotypes or gene functions.

In addition, one or more polynucleotide sequences or one or more polypeptides encoded by the polynucleotide sequences may be used to search against BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25: 217-221), PFAM, and other databases which contain previously identified and annotated motifs, sequences and gene functions. Methods that search for primary sequence patterns with secondary structure, gap penalties (Smith et al. (1992) Protein Engineering 5: 35-51) as well as algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul (1993) J. Mol. Evol. 36: 290-300; Altschul et al. (1990) supra), BLOCKS (Henikoff and Henikoff (1991) Nucleic Acids Res. 19: 6565-6572), Hidden Markov Models (HMM; Eddy (1996) Curr. Opin. Str. Biol. 6: 361-365; Sonnhammer et al. (1997) Proteins 28: 405-420), and the like, can be used to manipulate and analyze polynucleotide and polypeptide sequences encoded by polynucleotides. These databases, algorithms and other methods are well known in the art and are described in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., unit 7.7) and in Meyers (1995; Molecular Biology and Biotechnology, Wiley VCH, New York, N.Y., p 856-853).

Furthermore, methods using manual alignment of sequences similar or homologous to one or more polynucleotide sequences or one or more polypeptides encoded by the polynucleotide sequences may be used to identify regions of similarity and conserved domains. Such manual methods are well known to those of skill in the art and can include, for example, comparisons of tertiary structure between a polypeptide sequence encoded by a polynucleotide which comprises a known function with a polypeptide sequence encoded by a polynucleotide sequence which has a function not yet determined. Such examples of tertiary structure may comprise predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc finger motifs, proline-rich regions, cysteine repeat motifs, and the like.

Orthologs and paralogs of presently disclosed transcription factors may be cloned using compositions provided by the present invention according to methods well known in the art. cDNAs can be cloned using mRNA from a plant cell or tissue that expresses one of the present transcription factors. Appropriate mRNA sources may be identified by interrogating Northern blots with probes designed from the present transcription factor sequences, after which a library is prepared from the mRNA obtained from a positive cell or tissue. Transcription factor-encoding cDNA is then isolated using, for example, PCR, using primers designed from a presently disclosed transcription factor gene sequence, or by probing with a partial or complete cDNA or with one or more sets of degenerate probes based on the disclosed sequences. The cDNA library may be used to transform plant cells. Expression of the cDNAs of interest is detected using, for example, methods disclosed herein such as microarrays, Northern blots, quantitative PCR, or any other technique for monitoring changes in expression. Genomic clones may be isolated using similar techniques to those.

In addition to the sequences listed in the Sequence Listing, the invention encompasses isolated nucleotide sequences that are sequentially and structurally similar to G481, G482, G485, and G682, SEQ ID NO: 87, 89, 2009, and 147, respectively, and function in a plant in a manner similar to G481, G482, G485 and G682 by regulating abiotic stress tolerance.

The nucleotide sequences of the G482 subclade of the non-LEC1-like clade of proteins of the L1L-related CCAAT transcription factor family, including G481, share at least 81% identity in their B domains with the B domain of G481 (Table 1). Sequences outside of this subclade, including L1L (NP_199578) and LEC1 (AAC39488) share significantly less identity with G481 (Table 1), are phylogenetically distinct from the members of this subclade (FIGS. 6 and 7), and appear to function in embryonic development rather than in abiotic stress tolerance, as noted above.

Since the members of this subclade are phylogenetically related, sequentially similar and a representative number from diverse plant species have been shown to regulate abiotic stress tolerance, one skilled in the art would predict that other similar, phylogenetically related sequences would also regulate abiotic stress tolerance.

Similar to the G481 subclade, G682 and similar sequences in the G682 subclade of MYB-related transcription factors are phylogenetically related, sequentially similar and a representative number from diverse plant species have been shown to regulate abiotic stress tolerance. This would prompt one skilled in the art to draw similar conclusions regarding the regulation of abiotic stress tolerance by these sequences. A representative number of the members of this clade, including sequences derived from diverse non-Arabidopsis species, have been shown to confer abiotic stress tolerance when overexpressed. These sequences have been shown to share 60% identity in their MYB-related domains (Table 3).

Identifying Polynucleotides or Nucleic Acids by Hybridization

Polynucleotides homologous to the sequences illustrated in the Sequence Listing and tables can be identified, e.g., by hybridization to each other under stringent or under highly stringent conditions. Single-stranded polynucleotides hybridize when they associate based on a variety of well characterized physical-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. The stringency of a hybridization reflects the degree of sequence identity of the nucleic acids involved, such that the higher the stringency, the more similar are the two polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, salt concentration and composition, organic and non-organic additives, solvents, etc. present in both the hybridization and wash solutions and incubations (and number thereof), as described in more detail in the references cited above.

Encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, including any of the transcription factor polynucleotides within the Sequence Listing, and fragments thereof under various conditions of stringency (See, for example, Wahl and Berger (1987) Methods Enzymol. 152: 399-407; and Kimmel (1987) Methods Enzymol. 152: 507-511). In addition to the nucleotide sequences listed in Tables 7 and 8, full length cDNA, orthologs, and paralogs of the present nucleotide sequences may be identified and isolated using well-known methods. The cDNA libraries, orthologs, and paralogs of the present nucleotide sequences may be screened using hybridization methods to determine their utility as hybridization target or amplification probes.

With regard to hybridization, conditions that are highly stringent, and means for achieving them, are well known in the art. See, for example, Sambrook et al. (1989) “Molecular Cloning: A Laboratory Manual” (2nd ed., Cold Spring Harbor Laboratory); Berger and Kimmel, eds., (1987) “Guide to Molecular Cloning Techniques”, In Methods in Enzymology: 152: 467-469; and Anderson and Young (1985) “Quantitative Filter Hybridisation.” In: Hames and Higgins, ed., Nucleic Acid Hybridisation, A Practical Approach. Oxford, IRL Press, 73-111.

Stability of DNA duplexes is affected by such factors as base composition, length, and degree of base pair mismatch. Hybridization conditions may be adjusted to allow DNAs of different sequence relatedness to hybridize. The melting temperature (T_(m)) is defined as the temperature when 50% of the duplex molecules have dissociated into their constituent single strands. The melting temperature of a perfectly matched duplex, where the hybridization buffer contains formamide as a denaturing agent, may be estimated by the following equations:

DNA-DNA: $T_m(^{\circ}\mathrm{C}) = 81.5 + 16.6(\log[\mathrm{Na}^{+}]) + 0.41(\%\,\mathrm{G+C}) - 0.62(\%\,\mathrm{formamide}) - 500/L$  (1)

DNA-RNA: $T_m(^{\circ}\mathrm{C}) = 79.8 + 18.5(\log[\mathrm{Na}^{+}]) + 0.58(\%\,\mathrm{G+C}) + 0.12(\%\,\mathrm{G+C})^{2} - 0.5(\%\,\mathrm{formamide}) - 820/L$  (2)

RNA-RNA: $T_m(^{\circ}\mathrm{C}) = 79.8 + 18.5(\log[\mathrm{Na}^{+}]) + 0.58(\%\,\mathrm{G+C}) + 0.12(\%\,\mathrm{G+C})^{2} - 0.35(\%\,\mathrm{formamide}) - 820/L$  (3)

where L is the length of the duplex formed, [Na+] is the molar concentration of the sodium ion in the hybridization or washing solution, and % G+C is the percentage of (guanine+cytosine) bases in the hybrid. For imperfectly matched hybrids, the melting temperature is reduced by approximately 1° C. for each 1% mismatch.
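Equation (1) and the mismatch correction can be evaluated with a short Python sketch. The function and parameter names are hypothetical, and the logarithm is assumed to be base 10, which the equation does not state explicitly. Only the DNA-DNA form is shown.

```python
import math


def tm_dna_dna(na_molar, gc_percent, formamide_percent, length_bp,
               mismatch_percent=0.0):
    """Estimate T_m (deg C) of a DNA-DNA duplex using equation (1).

    Assumptions: log is base 10; each 1% mismatch lowers T_m by about 1 deg C.
    """
    tm_perfect = (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.62 * formamide_percent
        - 500.0 / length_bp
    )
    return tm_perfect - 1.0 * mismatch_percent


if __name__ == "__main__":
    # 1 M Na+, 50% G+C, 50% formamide, 500 bp duplex, perfectly matched.
    print(round(tm_dna_dna(1.0, 50.0, 50.0, 500), 1))
```

With 1 M sodium the logarithmic term vanishes, which makes the contribution of formamide and duplex length easy to inspect by hand.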

Hybridization experiments are generally conducted in a buffer of pH between 6.8 and 7.4, although the rate of hybridization is nearly independent of pH at ionic strengths likely to be used in the hybridization buffer (Anderson et al. (1985) supra). In addition, one or more of the following may be used to reduce non-specific hybridization: sonicated salmon sperm DNA or another non-complementary DNA, bovine serum albumin, sodium pyrophosphate, sodium dodecylsulfate (SDS), polyvinyl-pyrrolidone, ficoll and Denhardt's solution. Dextran sulfate and polyethylene glycol 6000 act to exclude DNA from solution, thus raising the effective probe DNA concentration and the hybridization signal within a given unit of time. In some instances, conditions of even greater stringency may be desirable or required to reduce non-specific and/or background hybridization. These conditions may be created with the use of higher temperature, lower ionic strength and higher concentration of a denaturing agent such as formamide.

Stringency conditions can be adjusted to screen for moderately similar fragments such as homologous sequences from distantly related organisms, or for highly similar fragments such as genes that duplicate functional enzymes from closely related organisms. The stringency can be adjusted either during the hybridization step or in the post-hybridization washes. Salt concentration, formamide concentration, hybridization temperature and probe lengths are variables that can be used to alter stringency (as described by the formula above). As a general guideline, high stringency is typically performed at T_(m)−5° C. to T_(m)−20° C., moderate stringency at T_(m)−20° C. to T_(m)−35° C. and low stringency at T_(m)−35° C. to T_(m)−50° C. for duplex >150 base pairs. Hybridization may be performed at low to moderate stringency (25-50° C. below T_(m)), followed by post-hybridization washes at increasing stringencies. Maximum rates of hybridization in solution are determined empirically to occur at T_(m)−25° C. for DNA-DNA duplex and T_(m)−15° C. for RNA-DNA duplex. Optionally, the degree of dissociation may be assessed after each wash step to determine the need for subsequent, higher stringency wash steps.
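The temperature windows in the guideline above can be expressed as a small Python helper. The function name is hypothetical, and because the stated ranges share their end points (T_(m)−20° C. and T_(m)−35° C.), the sketch assumes that a shared boundary belongs to the higher-stringency class.

```python
def stringency_class(tm_c, temperature_c):
    """Classify a hybridization or wash temperature relative to T_m.

    Assumption: shared boundaries (20 and 35 deg C below T_m) are assigned
    to the higher-stringency class. Applies to duplexes >150 base pairs.
    """
    below_tm = tm_c - temperature_c
    if 5 <= below_tm <= 20:
        return "high"
    if 20 < below_tm <= 35:
        return "moderate"
    if 35 < below_tm <= 50:
        return "low"
    return "outside guideline"


if __name__ == "__main__":
    # A 65 deg C wash for a duplex with an estimated T_m of 80 deg C.
    print(stringency_class(80.0, 65.0))
```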

High stringency conditions may be used to select for nucleic acid sequences with high degrees of identity to the disclosed sequences. An example of stringent hybridization conditions obtained in a filter-based method such as a Southern or northern blot for hybridization of complementary nucleic acids that have more than 100 complementary residues is about 5° C. to 20° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. Conditions used for hybridization may include about 0.02 M to about 0.15 M sodium chloride, about 0.5% to about 5% casein, about 0.02% SDS or about 0.1% N-laurylsarcosine, about 0.001 M to about 0.03 M sodium citrate, at hybridization temperatures between about 50° C. and about 70° C. More preferably, high stringency conditions are about 0.02 M sodium chloride, about 0.5% casein, about 0.02% SDS, about 0.001 M sodium citrate, at a temperature of about 50° C. Nucleic acid molecules that hybridize under stringent conditions will typically hybridize to a probe based on either the entire DNA molecule or selected portions, e.g., to a unique subsequence, of the DNA.

Stringent salt concentration will ordinarily be less than about 750 mM NaCl and 75 mM trisodium citrate. Increasingly stringent conditions may be obtained with less than about 500 mM NaCl and 50 mM trisodium citrate, to even greater stringency with less than about 250 mM NaCl and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the absence of organic solvent, e.g., formamide, whereas high stringency hybridization may be obtained in the presence of at least about 35% formamide, and more preferably at least about 50% formamide. Stringent temperature conditions will ordinarily include temperatures of at least about 30° C., more preferably of at least about 37° C., and most preferably of at least about 42° C. with formamide present. Varying additional parameters, such as hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS) and ionic strength, are well known to those skilled in the art. Various levels of stringency are accomplished by combining these various conditions as needed. In a preferred embodiment, hybridization will occur at 30° C. in 750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred embodiment, hybridization will occur at 37° C. in 500 mM NaCl, 50 mM trisodium citrate, 1% SDS, 35% formamide. In a most preferred embodiment, hybridization will occur at 42° C. in 250 mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide. Useful variations on these conditions will be readily apparent to those skilled in the art.

The washing steps that follow hybridization may also vary in stringency; the post-hybridization wash steps primarily determine hybridization specificity, with the most critical factors being temperature and the ionic strength of the final wash solution. Wash stringency can be increased by decreasing salt concentration or by increasing temperature. Stringent salt concentration for the wash steps will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. For example, the wash conditions may be under conditions of 0.1×SSC to 2.0×SSC and 0.1% SDS at 50-65° C., with, for example, two steps of 10-30 min. One example of stringent wash conditions includes about 2.0×SSC, 0.1% SDS at 65° C. and washing twice, each wash step being about 30 min. A higher stringency wash is about 0.2×SSC, 0.1% SDS at 65° C. and washing twice for 30 min. A still higher stringency wash is about 0.1×SSC, 0.1% SDS at 65° C. and washing twice for 30 min. The temperature for the wash solutions will ordinarily be at least about 25° C., and for greater stringency at least about 42° C. Hybridization stringency may be increased further by using the same conditions as in the hybridization steps, with the wash temperature raised about 3° C. to about 5° C., and stringency may be increased even further by using the same conditions except the wash temperature is raised about 6° C. to about 9° C. For identification of less closely related homologs, wash steps may be performed at a lower temperature, e.g., 50° C.
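The wash conditions above are given both as millimolar salt concentrations and as multiples of SSC. The Python sketch below converts between the two, taking the standard laboratory composition of 1×SSC (150 mM NaCl, 15 mM trisodium citrate) as its basis; this composition is not spelled out in the text but agrees with its paired values (0.2×SSC with 30 mM NaCl and 3 mM citrate). The names are hypothetical.

```python
# Composition of 1x SSC in millimolar (standard formulation, assumed here).
SSC_1X_NACL_MM = 150.0
SSC_1X_CITRATE_MM = 15.0


def ssc_to_millimolar(fold):
    """Return (NaCl mM, trisodium citrate mM) for a wash at `fold` x SSC."""
    return (fold * SSC_1X_NACL_MM, fold * SSC_1X_CITRATE_MM)


if __name__ == "__main__":
    # Salt content of the three example washes, from lowest to highest stringency.
    for fold in (2.0, 0.2, 0.1):
        nacl, citrate = ssc_to_millimolar(fold)
        print(f"{fold}x SSC: {nacl:.1f} mM NaCl, {citrate:.1f} mM citrate")
```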

An example of a low stringency wash step employs a solution and conditions of at least 25° C. in 30 mM NaCl, 3 mM trisodium citrate, and 0.1% SDS over 30 min. Greater stringency may be obtained at 42° C. in 15 mM NaCl, with 1.5 mM trisodium citrate, and 0.1% SDS over 30 min. Even higher stringency wash conditions are obtained at 65° C.-68° C. in a solution of 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Wash procedures will generally employ at least two final wash steps. Additional variations on these conditions will be readily apparent to those skilled in the art (see, for example, U.S. Patent Application No. 20010010913).

Stringency conditions can be selected such that an oligonucleotide that is perfectly complementary to the coding oligonucleotide hybridizes to the coding oligonucleotide with at least about a 5-10× higher signal to noise ratio than the ratio for hybridization of the perfectly complementary oligonucleotide to a nucleic acid encoding a transcription factor known as of the filing date of the application. It may be desirable to select conditions for a particular assay such that a higher signal to noise ratio, that is, about 15× or more, is obtained. Accordingly, a subject nucleic acid will hybridize to a unique coding oligonucleotide with at least a 2× or greater signal to noise ratio as compared to hybridization of the coding oligonucleotide to a nucleic acid encoding a known polypeptide. The particular signal will depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the like. Labeled hybridization or PCR probes for detecting related polynucleotide sequences may be produced by oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.

Identifying Polynucleotides or Nucleic Acids with Expression Libraries

In addition to hybridization methods, transcription factor homolog polypeptides can be obtained by screening an expression library using antibodies specific for one or more transcription factors. With the provision herein of the disclosed transcription factor, and transcription factor homolog nucleic acid sequences, the encoded polypeptide(s) can be expressed and purified in a heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific for the polypeptide(s) in question. Antibodies can also be raised against synthetic peptides derived from transcription factor, or transcription factor homolog, amino acid sequences. Methods of raising antibodies are well known in the art and are described in Harlow and Lane (1988), Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York. Such antibodies can then be used to screen an expression library produced from the plant from which it is desired to clone additional transcription factor homologs, using the methods described above. The selected cDNAs can be confirmed by sequencing and enzymatic activity.

Sequence Variations

It will readily be appreciated by those of skill in the art, that any of a variety of polynucleotide sequences are capable of encoding the transcription factors and transcription factor homolog polypeptides of the invention. Due to the degeneracy of the genetic code, many different polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing. Nucleic acids having a sequence that differs from the sequences shown in the Sequence Listing, or complementary sequences, that encode functionally equivalent peptides (i.e., peptides having some degree of equivalent or similar biological activity) but differ in sequence from the sequence shown in the Sequence Listing due to degeneracy in the genetic code, are also within the scope of the invention.

Altered polynucleotide sequences encoding polypeptides include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polynucleotide encoding a polypeptide with at least one functional characteristic of the instant polypeptides. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding the instant polypeptides, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding the instant polypeptides.

Allelic variant refers to any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in phenotypic polymorphism within populations. Gene mutations can be silent (i.e., no change in the encoded polypeptide) or may encode polypeptides having altered amino acid sequence. The term allelic variant is also used herein to denote a protein encoded by an allelic variant of a gene. Splice variant refers to alternative forms of RNA transcribed from a gene. Splice variation arises naturally through use of alternative splicing sites within a transcribed RNA molecule, or less commonly between separately transcribed RNA molecules, and may result in several mRNAs transcribed from the same gene. Splice variants may encode polypeptides having altered amino acid sequence. The term splice variant is also used herein to denote a protein encoded by a splice variant of an mRNA transcribed from a gene.

Those skilled in the art would recognize that, for example, although G481, SEQ ID NO: 88, represents a single transcription factor, allelic variation and alternative splicing may be expected to occur. Allelic variants of SEQ ID NO: 87 can be cloned by probing cDNA or genomic libraries from different individual organisms according to standard procedures. Allelic variants of the DNA sequence shown in SEQ ID NO: 87, including those containing silent mutations and those in which mutations result in amino acid sequence changes, are within the scope of the present invention, as are proteins which are allelic variants of SEQ ID NO: 88. cDNAs generated from alternatively spliced mRNAs, which retain the properties of the transcription factor, are included within the scope of the present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic variants and splice variants of these sequences can be cloned by probing cDNA or genomic libraries from different individual organisms or tissues according to standard procedures known in the art (see U.S. Pat. No. 6,388,064).

Thus, in addition to the sequences set forth in the Sequence Listing,the invention also encompasses related nucleic acid molecules thatinclude allelic or splice variants of SEQ ID NO: 2N−1, wherein N=1-229,SEQ ID NO: 459-466; 468-487; 491-500; 504; 506-511; 516-520; 523-524;527; 529; 531-533; 538-539; 541-557; 560-568; 570-586; 595-596; 598-606;610-620; 627-634; 640-664; 670-707; 714-719; 722-735; 740-741; 743-779;808-823; 825-834; 838-850; 855-864; 868-889; 892-902; 908-909; 914-921;924-925; 927-932; 935-942; 944-952; 961-965; 968-986; 989-993; 995-1010;1012-1034; 1043-1063; 1074-1080; 1091-1104; 1111-1121; 1123-1128;1134-1138; 1142-1156; 1159-1175; 1187-1190; 1192-1199; 1202-1220;1249-1253; 1258-1262; 1264-1269; 1271-1287; 1292-1301; 1303-1309;1315-1323; 1328-1337; 1340-1341; 1344-1361; 1365-1377; 1379-1390;1393-1394; 1396-1398; 1419-1432; 1434-1452; 1455-1456; 1460-1465;1468-1491; 1499; 1502; 1505-1521; 1523-1527; 1529-1532; 1536-1539;1542-1562; 1567-1571; 1573-1582; 1587-1592; 1595-1620; 1625-1644;1647-1654; 1659-1669; 1671-1673; 1675-1680; 1682-1686; 1688-1700;1706-1709; 1714-1726; 1728-1734; 1738-1742; 1744-1753; 1757-1760;1763-1764; 1766-1768; 1770-1780; 1782-1784; 1786-1789; 1791-1804;1806-1812; 1814-1837; 1847-1856; 1858-1862; 1864-1873; 1876-1882;1885-1896; 1902-1910; 1913-1916; 1921-1928; 1931-1936; 1940-1941;1944-1946, 2907-2941, 2944, 2945, 2947, 2949, or SEQ ID NO: 2N−1,wherein N=974-1101, and include sequences which are complementary to anyof the above nucleotide sequences. 
Related nucleic acid molecules alsoinclude nucleotide sequences encoding a polypeptide comprising orconsisting essentially of a substitution, modification, addition and/ordeletion of one or more amino acid residues compared to the polypeptideas set forth in any of SEQ ID NO: 2N, wherein N=1-229, SEQ ID NO: 467;488-490; 501-503; 505; 512-515; 521-522; 525-526; 528; 530; 534-537;540; 558-559; 569; 587-594; 597; 607-609; 621-626; 635-639; 665-669;708-713; 720-721; 736-739; 742; 780-807; 824; 835-837; 851-854; 865-867;890-891; 903-907; 910-913; 922-923; 926; 933-934; 943; 953-960; 966-967;987-988; 994; 1011; 1035-1042; 1064-1073; 1081-1090; 1105-1110; 1122;1129-1133; 1139-1141; 1157-1158; 1176-1186; 1191; 1200-1201; 1221-1248;1254-1257; 1263; 1270; 1288-1291; 1302; 1310-1314; 1324-1327; 1338-1339;1342-1343; 1362-1364; 1378; 1391-1392; 1395; 1399-1418; 1433; 1453-1454;1457-1459; 1466-1467; 1492-1498; 1500-1501; 1503-1504; 1522; 1528;1533-1535; 1540-1541; 1563-1566; 1572; 1583-1586; 1593-1594; 1621-1624;1645-1646; 1655-1658; 1670; 1674; 1681; 1687; 1701-1705; 1710-1713;1727; 1735-1737; 1743; 1754-1756; 1761-1762; 1765; 1769; 1781; 1785;1790; 1805; 1813; 1838-1846; 1857; 1863; 1874-1875; 1883-1884;1897-1901; 1911-1912; 1917-1920; 1929-1930; 1937-1939; 1942-1943; 2942or 2943, 2945, 2947, 2949, or SEQ ID NO: 2N, wherein N=974-1101. Suchrelated polypeptides may comprise, for example, additions and/ordeletions of one or more N-linked or O-linked glycosylation sites, or anaddition and/or a deletion of one or more cysteine residues.

For example, Table 4 illustrates that the codons AGC, AGT, TCA, TCC, TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position in the sequence where there is a codon encoding serine, any of the above trinucleotide sequences can be used without altering the encoded polypeptide.

TABLE 4

Amino acid                    Possible Codons
Alanine        Ala   A        GCA GCC GCG GCT
Cysteine       Cys   C        TGC TGT
Aspartic acid  Asp   D        GAC GAT
Glutamic acid  Glu   E        GAA GAG
Phenylalanine  Phe   F        TTC TTT
Glycine        Gly   G        GGA GGC GGG GGT
Histidine      His   H        CAC CAT
Isoleucine     Ile   I        ATA ATC ATT
Lysine         Lys   K        AAA AAG
Leucine        Leu   L        TTA TTG CTA CTC CTG CTT
Methionine     Met   M        ATG
Asparagine     Asn   N        AAC AAT
Proline        Pro   P        CCA CCC CCG CCT
Glutamine      Gln   Q        CAA CAG
Arginine       Arg   R        AGA AGG CGA CGC CGG CGT
Serine         Ser   S        AGC AGT TCA TCC TCG TCT
Threonine      Thr   T        ACA ACC ACG ACT
Valine         Val   V        GTA GTC GTG GTT
Tryptophan     Trp   W        TGG
Tyrosine       Tyr   Y        TAC TAT

Sequence alterations that do not change the amino acid sequence encoded by the polynucleotide are termed “silent” variations. With the exception of the codons ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible codons for the same amino acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis, available in the art. Accordingly, any and all such variations of a sequence selected from the above table are a feature of the invention.
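To make the notion of a silent variation concrete, the following Python sketch encodes Table 4 as a dictionary, translates a coding sequence, tests whether two sequences are silent variants of one another, and counts how many distinct polynucleotides encode a given peptide. All function names are hypothetical. Alanine is written with the DNA codon GCT, consistent with the DNA alphabet used in the rest of the table, and stop codons are not handled because Table 4 does not list them.

```python
# Codons per amino acid (one-letter code), transcribed from Table 4.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCT"],
    "C": ["TGC", "TGT"],
    "D": ["GAC", "GAT"],
    "E": ["GAA", "GAG"],
    "F": ["TTC", "TTT"],
    "G": ["GGA", "GGC", "GGG", "GGT"],
    "H": ["CAC", "CAT"],
    "I": ["ATA", "ATC", "ATT"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTA", "CTC", "CTG", "CTT"],
    "M": ["ATG"],
    "N": ["AAC", "AAT"],
    "P": ["CCA", "CCC", "CCG", "CCT"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGT"],
    "S": ["AGC", "AGT", "TCA", "TCC", "TCG", "TCT"],
    "T": ["ACA", "ACC", "ACG", "ACT"],
    "V": ["GTA", "GTC", "GTG", "GTT"],
    "W": ["TGG"],
    "Y": ["TAC", "TAT"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(dna_a, dna_b):
    """True if the sequences differ but encode the same polypeptide."""
    return dna_a != dna_b and translate(dna_a) == translate(dna_b)


def count_synonymous_sequences(peptide):
    """Number of distinct polynucleotides that encode the peptide."""
    total = 1
    for aa in peptide:
        total *= len(CODONS[aa])
    return total


if __name__ == "__main__":
    # AGC and TCT both encode serine; ATG and TGG have no alternatives.
    print(is_silent_variant("ATGAGCTGG", "ATGTCTTGG"))
    print(count_synonymous_sequences("MSW"))
```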

In addition to silent variations, other conservative variations that alter one, or a few amino acids in the encoded polypeptide, can be made without altering the function of the polypeptide; these conservative variants are, likewise, a feature of the invention.

For example, substitutions, deletions and insertions introduced into the sequences provided in the Sequence Listing, are also envisioned by the invention. Such sequence modifications can be engineered into a sequence by site-directed mutagenesis (Wu (ed.) Methods Enzymol. (1993) vol. 217, Academic Press) or the other methods noted below. Amino acid substitutions are typically of single residues; insertions usually will be on the order of about from 1 to 10 amino acid residues; and deletions will range about from 1 to 30 residues. In preferred embodiments, deletions or insertions are made in adjacent pairs, e.g., a deletion of two residues or insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the transcription factor should not place the sequence out of reading frame and should not create complementary regions that could produce secondary mRNA structure. Preferably, the polypeptide encoded by the DNA performs the desired function.

Conservative substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 5 when it is desired to maintain the activity of the protein. Table 5 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as conservative substitutions.

TABLE 5

Residue   Conservative Substitutions
Ala       Ser
Arg       Lys
Asn       Gln; His
Asp       Glu
Gln       Asn
Cys       Ser
Glu       Asp
Gly       Pro
His       Asn; Gln
Ile       Leu; Val
Leu       Ile; Val
Lys       Arg; Gln
Met       Leu; Ile
Phe       Met; Leu; Tyr
Ser       Thr; Gly
Thr       Ser; Val
Trp       Tyr
Tyr       Trp; Phe
Val       Ile; Leu
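As an illustration of how Table 5 might be applied, the sketch below compares two aligned residue lists and flags each difference as conservative or not. The dictionary is transcribed from Table 5; the function names are hypothetical, and the lookup is applied one way only (original residue to replacement), because the text does not state that Table 5 is to be read in both directions.

```python
# Transcribed from Table 5: original residue -> conservative replacements.
TABLE_5 = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Gln": {"Asn"}, "Cys": {"Ser"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"}, "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"}, "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"}, "Trp": {"Tyr"}, "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(residue, replacement):
    """True if Table 5 lists `replacement` for the original `residue`."""
    return replacement in TABLE_5.get(residue, set())


def classify_substitutions(reference, variant):
    """Return (position, original, replacement, conservative?) per change.

    Both arguments are equal-length lists of three-letter residue codes;
    positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, ref, alt, is_conservative(ref, alt))
        for pos, (ref, alt) in enumerate(zip(reference, variant), start=1)
        if ref != alt
    ]
```

Residues absent from the first column of Table 5 (proline, for instance) simply have no listed conservative replacement under this reading.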

Similar substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 6 when it is desired to maintain the activity of the protein. Table 6 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as structural and functional substitutions. For example, a residue in column 1 of Table 6 may be substituted with a residue in column 2; in addition, a residue in column 2 of Table 6 may be substituted with the residue of column 1.

TABLE 6

Residue   Similar Substitutions
Ala       Ser; Thr; Gly; Val; Leu; Ile
Arg       Lys; His; Gly
Asn       Gln; His; Gly; Ser; Thr
Asp       Glu; Ser; Thr
Gln       Asn; Ala
Cys       Ser; Gly
Glu       Asp
Gly       Pro; Arg
His       Asn; Gln; Tyr; Phe; Lys; Arg
Ile       Ala; Leu; Val; Gly; Met
Leu       Ala; Ile; Val; Gly; Met
Lys       Arg; His; Gln; Gly; Pro
Met       Leu; Ile; Phe
Phe       Met; Leu; Tyr; Trp; His; Val; Ala
Ser       Thr; Gly; Asp; Ala; Val; Ile; His
Thr       Ser; Val; Ala; Gly
Trp       Tyr; Phe; His
Tyr       Trp; Phe; His
Val       Ala; Ile; Leu; Gly; Thr; Ser; Glu
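Because a residue in either column of Table 6 may replace its partner in the other column, the table defines a symmetric relation. A minimal sketch of that reading follows; the dictionary is transcribed from Table 6 and the function name `is_similar` is hypothetical.

```python
# Transcribed from Table 6: column 1 residue -> column 2 residues.
TABLE_6 = {
    "Ala": {"Ser", "Thr", "Gly", "Val", "Leu", "Ile"},
    "Arg": {"Lys", "His", "Gly"},
    "Asn": {"Gln", "His", "Gly", "Ser", "Thr"},
    "Asp": {"Glu", "Ser", "Thr"},
    "Gln": {"Asn", "Ala"},
    "Cys": {"Ser", "Gly"},
    "Glu": {"Asp"},
    "Gly": {"Pro", "Arg"},
    "His": {"Asn", "Gln", "Tyr", "Phe", "Lys", "Arg"},
    "Ile": {"Ala", "Leu", "Val", "Gly", "Met"},
    "Leu": {"Ala", "Ile", "Val", "Gly", "Met"},
    "Lys": {"Arg", "His", "Gln", "Gly", "Pro"},
    "Met": {"Leu", "Ile", "Phe"},
    "Phe": {"Met", "Leu", "Tyr", "Trp", "His", "Val", "Ala"},
    "Ser": {"Thr", "Gly", "Asp", "Ala", "Val", "Ile", "His"},
    "Thr": {"Ser", "Val", "Ala", "Gly"},
    "Trp": {"Tyr", "Phe", "His"},
    "Tyr": {"Trp", "Phe", "His"},
    "Val": {"Ala", "Ile", "Leu", "Gly", "Thr", "Ser", "Glu"},
}


def is_similar(residue_a, residue_b):
    """True if the pair appears in Table 6 in either column order."""
    return (
        residue_b in TABLE_6.get(residue_a, set())
        or residue_a in TABLE_6.get(residue_b, set())
    )
```

Under this reading, Glu to Val counts as a similar substitution even though the Glu row lists only Asp, because the Val row lists Glu.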

Substitutions that are less conservative than those in Table 5 can beselected by picking residues that differ more significantly in theireffect on maintaining (a) the structure of the polypeptide backbone inthe area of the substitution, for example, as a sheet or helicalconformation, (b) the charge or hydrophobicity of the molecule at thetarget site, or (c) the bulk of the side chain. The substitutions whichin general are expected to produce the greatest changes in proteinproperties will be those in which (a) a hydrophilic residue, e.g., serylor threonyl, is substituted for (or by) a hydrophobic residue, e.g.,leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine orproline is substituted for (or by) any other residue; (c) a residuehaving an electropositive side chain, e.g., lysyl, arginyl, or histidyl,is substituted for (or by) an electronegative residue, e.g., glutamyl oraspartyl; or (d) a residue having a bulky side chain, e.g.,phenylalanine, is substituted for (or by) one not having a side chain,e.g., glycine.

Further Modifying Sequences of the Invention—Mutation/Forced Evolution

In addition to generating silent or conservative substitutions as noted above, the present invention optionally includes methods of modifying the sequences of the Sequence Listing. In the methods, nucleic acid or protein modification methods are used to alter the given sequences to produce new sequences and/or to chemically or enzymatically modify given sequences to change the properties of the nucleic acids or proteins.

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., according to standard mutagenesis or artificial evolution methods, to produce modified sequences. The modified sequences may be created using purified natural polynucleotides isolated from any organism or may be synthesized from purified compositions and chemicals using chemical means well known to those of skill in the art. For example, Ausubel, supra, provides additional details on mutagenesis methods. Artificial forced evolution methods are described, for example, by Stemmer (1994) Nature 370: 389-391, Stemmer (1994) Proc. Natl. Acad. Sci. 91: 10747-10751, and U.S. Pat. Nos. 5,811,238, 5,837,500, and 6,242,568. Methods for engineering synthetic transcription factors and other polypeptides are described, for example, by Zhang et al. (2000) J. Biol. Chem. 275: 33850-33860, Liu et al. (2001) J. Biol. Chem. 276: 11323-11334, and Isalan et al. (2001) Nature Biotechnol. 19: 656-660. Many other mutation and evolution methods are also available and expected to be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and polypeptides can be performed by standard methods. For example, a sequence can be modified by addition of lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified nucleotides or amino acids, or the like. For example, protein modification techniques are illustrated in Ausubel, supra. Further details on chemical and enzymatic modifications can be found herein. These modification methods can be used to modify any given sequence, or to modify any sequence produced by the various mutation and artificial evolution modification methods noted herein.

Accordingly, the invention provides for modification of any givennucleic acid by mutation, evolution, chemical or enzymatic modification,or other available methods, as well as for the products produced bypracticing such methods, e.g., using the sequences herein as a startingsubstrate for the various modification approaches.

For example, an optimized coding sequence containing codons preferred by a particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of translation or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared with transcripts produced using a non-optimized sequence. Translation stop codons can also be modified to reflect host preference. For example, preferred stop codons for Saccharomyces cerevisiae and mammals are TAA and TGA, respectively. The preferred stop codon for monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as the stop codon.
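The stop-codon preferences listed above can be captured in a small lookup. The sketch below is illustrative only: the host labels used as dictionary keys and the function name `set_preferred_stop` are assumptions introduced here, while the codon preferences themselves are those stated in the preceding paragraph.

```python
# Host preferences as stated in the text; the key strings are illustrative.
PREFERRED_STOP = {
    "Saccharomyces cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "E. coli": "TAA",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def set_preferred_stop(cds, host):
    """Replace the terminal stop codon of `cds` with the host's preference."""
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        raise ValueError("expected an in-frame sequence ending in a stop codon")
    return cds[:-3] + PREFERRED_STOP[host]
```

Only the terminal codon is changed, so the encoded polypeptide is unaffected; this is a special case of the silent variation discussed earlier.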

The polynucleotide sequences of the present invention can also beengineered in order to alter a coding sequence for a variety of reasons,including but not limited to, alterations which modify the sequence tofacilitate cloning, processing and/or expression of the gene product.For example, alterations are optionally introduced using techniqueswhich are well known in the art, e.g., site-directed mutagenesis, toinsert new restriction sites, to alter glycosylation patterns, to changecodon preference, to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptidesof the invention can be combined with domains derived from othertranscription factors or synthetic domains to modify the biologicalactivity of a transcription factor. For instance, a DNA-binding domainderived from a transcription factor of the invention can be combinedwith the activation domain of another transcription factor or with asynthetic activation domain. A transcription activation domain assistsin initiating transcription from a DNA-binding site. Examples includethe transcription activation region of VP16 or GAL4 (Moore et al. (1998)Proc. Natl. Acad. Sci. 95: 376-381; Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from bacterial sequences (Ma and Ptashne(1987) Cell 51: 113-119) and synthetic peptides (Giniger and Ptashne(1987) Nature 330: 670-672).

Expression and Modification of Polypeptides

Typically, polynucleotide sequences of the invention are incorporatedinto recombinant DNA (or RNA) molecules that direct expression ofpolypeptides of the invention in appropriate host cells, transgenicplants, in vitro translation systems, or the like. Due to the inherentdegeneracy of the genetic code, nucleic acid sequences which encodesubstantially the same or a functionally equivalent amino acid sequencecan be substituted for any listed sequence to provide for cloning andexpressing the relevant homolog.

The transgenic plants of the present invention comprising recombinant polynucleotide sequences are generally derived from parental plants, which may themselves be non-transformed (or non-transgenic) plants. These transgenic plants may either have a transcription factor gene “knocked out” (for example, with a genomic insertion by homologous recombination, an antisense or ribozyme construct) or expressed to a normal or wild-type extent. However, overexpressing transgenic “progeny” plants will exhibit greater mRNA levels, wherein the mRNA encodes a transcription factor, that is, a DNA-binding protein that is capable of binding to a DNA regulatory sequence and inducing transcription, and preferably, expression of a plant trait gene. Preferably, the mRNA expression level will be at least three-fold greater than that of the parental plant, or more preferably at least ten-fold greater mRNA levels compared to said parental plant, and most preferably at least fifty-fold greater compared to said parental plant.
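The three-, ten- and fifty-fold thresholds above can be applied to measured mRNA levels as follows. This is a hedged sketch: the function name `overexpression_tier`, the returned labels, and the assumption that both levels are normalized measurements in the same arbitrary units are all introduced here for illustration.

```python
def overexpression_tier(transgenic_level, parental_level):
    """Classify the mRNA fold increase relative to the parental plant."""
    if parental_level <= 0:
        raise ValueError("parental level must be positive")
    fold = transgenic_level / parental_level
    if fold >= 50:
        return "at least fifty-fold"
    if fold >= 10:
        return "at least ten-fold"
    if fold >= 3:
        return "at least three-fold"
    return "below three-fold"
```

For example, a transgenic line measured at 120 units against a parental level of 10 units shows a 12-fold increase and falls in the ten-fold tier.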

Vectors, Promoters, and Expression Systems

The present invention includes recombinant constructs comprising one ormore of the nucleic acid sequences herein. The constructs typicallycomprise a vector, such as a plasmid, a cosmid, a phage, a virus (e.g.,a plant virus), a bacterial artificial chromosome (BAC), a yeastartificial chromosome (YAC), or the like, into which a nucleic acidsequence of the invention has been inserted, in a forward or reverseorientation. In a preferred aspect of this embodiment, the constructfurther comprises regulatory sequences, including, for example, apromoter, operably linked to the sequence. Large numbers of suitablevectors and promoters are known to those of skill in the art, and arecommercially available.

General texts that describe molecular biological techniques usefulherein, including the use and production of vectors, promoters and manyother relevant topics, include Berger, Sambrook, supra and Ausubel,supra. Any of the identified sequences can be incorporated into acassette or vector, e.g., for expression in plants. A number ofexpression vectors suitable for stable transformation of plant cells orfor the establishment of transgenic plants have been described includingthose described in Weissbach and Weissbach (1989) Methods for PlantMolecular Biology, Academic Press, and Gelvin et al. (1990) PlantMolecular Biology Manual, Kluwer Academic Publishers. Specific examplesinclude those derived from a Ti plasmid of Agrobacterium tumefaciens, aswell as those disclosed by Herrera-Estrella et al. (1983) Nature 303:209, Bevan (1984) Nucleic Acids Res. 12: 8711-8721, Klee (1985)Bio/Technology 3: 637-642, for dicotyledonous plants.

Alternatively, non-Ti vectors can be used to transfer the DNA into monocotyledonous plants and cells by using free DNA delivery techniques. Such methods can involve, for example, the use of liposomes, electroporation, microprojectile bombardment, silicon carbide whiskers, and viruses. By using these methods, transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9: 957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be produced. An immature embryo can also be a good target tissue for monocots for direct DNA delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol. 102: 1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemaux (1994) Plant Physiol. 104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotechnol. 14: 745-750).

Typically, plant transformation vectors include one or more cloned plant coding sequences (genomic or cDNA) under the transcriptional control of 5′ and 3′ regulatory sequences and a dominant selectable marker. Such plant transformation vectors typically also contain a promoter (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or a polyadenylation signal.

A potential utility for the transcription factor polynucleotides disclosed herein is the isolation of promoter elements from these genes that can be used to program expression in plants of any genes. Each transcription factor gene disclosed herein is expressed in a unique fashion, as determined by promoter elements located upstream of the start of translation, and additionally within an intron of the transcription factor gene or downstream of the termination codon of the gene. As is well known in the art, for a significant portion of genes, the promoter sequences are located entirely in the region directly upstream of the start of translation. In such cases, typically the promoter sequences are located within 2.0 kb of the start of translation, or within 1.5 kb of the start of translation, frequently within 1.0 kb of the start of translation, and sometimes within 0.5 kb of the start of translation.
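Extracting a candidate promoter window upstream of the start of translation can be sketched as below. This is an illustration under stated assumptions rather than a disclosed protocol: the function name `upstream_region` is hypothetical, the gene is assumed to lie on the forward strand of the supplied genomic string, and `atg_index` is the 0-based position of the A of the initiating ATG.

```python
def upstream_region(genomic, atg_index, window_bp=2000):
    """Return up to `window_bp` nucleotides directly upstream of the ATG.

    The window is truncated at the start of `genomic` when fewer than
    `window_bp` nucleotides are available.
    """
    if genomic[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    start = max(0, atg_index - window_bp)
    return genomic[start:atg_index]
```

Calling the function with `window_bp` set to 2000, 1500, 1000 or 500 yields the nested windows named in the paragraph above; promoter elements within introns or downstream of the termination codon are outside the scope of this sketch.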

The promoter sequences can be isolated according to methods known to oneskilled in the art.

Examples of constitutive plant promoters which can be useful forexpressing the TF sequence include: the cauliflower mosaic virus (CaMV)35S promoter, which confers constitutive, high-level expression in mostplant tissues (see, e.g., Odell et al. (1985) Nature 313: 810-812); thenopaline synthase promoter (An et al. (1988) Plant Physiol. 88:547-552); and the octopine synthase promoter (Fromm et al. (1989) PlantCell 1: 977-984).

A variety of plant gene promoters that regulate gene expression in response to environmental, hormonal, chemical, or developmental signals, or in a tissue-active manner, can be used for expression of a TF sequence in plants. Choice of a promoter is based largely on the phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to wounding, heat, cold, drought, light, pathogens, etc.), timing, developmental stage, and the like. Numerous known promoters have been characterized and can favorably be employed to promote expression of a polynucleotide of the invention in a transgenic plant or cell of interest. For example, tissue specific promoters include: seed-specific promoters (such as the napin, phaseolin or DC3 promoter described in U.S. Pat. No. 5,773,697), fruit-specific promoters that are active during fruit ripening (such as the dru 1 promoter (U.S. Pat. No. 5,783,393), or the 2A11 promoter (U.S. Pat. No. 4,943,674) and the tomato polygalacturonase promoter (Bird et al. (1988) Plant Mol. Biol. 11: 651-662)), root-specific promoters, such as those disclosed in U.S. Pat. Nos. 5,618,988, 5,837,848 and 5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (U.S. Pat. No. 5,792,929), promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol. Biol. 37: 977-988), flower-specific (Kaiser et al. (1995) Plant Mol. Biol. 28: 231-243), pollen (Baerson et al. (1994) Plant Mol. Biol. 26: 1947-1959), carpels (Ohl et al. (1990) Plant Cell 2: 837-848), pollen and ovules (Baerson et al. (1993) Plant Mol. Biol. 22: 255-267), auxin-inducible promoters (such as that described in van der Kop et al. (1999) Plant Mol. Biol. 39: 979-990 or Baumann et al. (1999) Plant Cell 11: 323-334), cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol. Biol. 38: 743-753), promoters responsive to gibberellin (Shi et al. (1998) Plant Mol. Biol. 38: 1053-1060, Willmott et al. (1998) 38: 817-825) and the like. Additional promoters are those that elicit expression in response to heat (Ainley et al. (1993) Plant Mol. Biol. 22: 13-23), light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1: 471-478, and the maize rbcS promoter, Schaffner and Sheen (1991) Plant Cell 3: 997-1012); wounding (e.g., wunI, Siebertz et al. (1989) Plant Cell 1: 961-968); pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol. Biol. 40: 387-396, and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38: 1071-1080), and chemicals such as methyl jasmonate or salicylic acid (Gatz (1997) Annu. Rev. Plant Physiol. Plant Mol. Biol. 48: 89-108). In addition, the timing of the expression can be controlled by using promoters such as those acting at senescence (Gan and Amasino (1995) Science 270: 1986-1988); or late seed development (Odell et al. (1994) Plant Physiol. 106: 447-458).

Plant expression vectors can also include RNA processing signals that can be positioned within, upstream or downstream of the coding sequence. In addition, the expression vectors can include additional regulatory sequences from the 3′-untranslated region of plant genes, e.g., a 3′ terminator region to increase stability of the mRNA, such as the PI-II terminator region of potato or the octopine or nopaline synthase 3′ terminator regions.

Additional Expression Elements

Specific initiation signals can aid in efficient translation of coding sequences. These signals can include, e.g., the ATG initiation codon and adjacent sequences. In cases where a coding sequence, its initiation codon and upstream sequences are inserted into the appropriate expression vector, no additional translational control signals may be needed. However, in cases where only coding sequence (e.g., a mature protein coding sequence), or a portion thereof, is inserted, exogenous translational control signals including the ATG initiation codon can be separately provided. The initiation codon is provided in the correct reading frame to facilitate translation. Exogenous translational elements and initiation codons can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of enhancers appropriate to the cell system in use.

Expression Hosts

The present invention also relates to host cells which are transducedwith vectors of the invention, and the production of polypeptides of theinvention (including fragments thereof) by recombinant techniques. Hostcells are genetically engineered (i.e., nucleic acids are introduced,e.g., transduced, transformed or transfected) with the vectors of thisinvention, which may be, for example, a cloning vector or an expressionvector comprising the relevant nucleic acids herein. The vector isoptionally a plasmid, a viral particle, a phage, a naked nucleic acid,etc. The engineered host cells can be cultured in conventional nutrientmedia modified as appropriate for activating promoters, selectingtransformants, or amplifying the relevant gene. The culture conditions,such as temperature, pH and the like, are those previously used with thehost cell selected for expression, and will be apparent to those skilledin the art and in the references cited herein, including, Sambrook,supra and Ausubel, supra.

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are also suitable for some applications. For example, the DNA fragments are introduced into plant tissues, cultured plant cells or plant protoplasts by standard methods including electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. 82: 5824-5828), infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn et al. (1982) Molecular Biology of Plant Tumors, Academic Press, New York, N.Y., pp. 549-560; U.S. Pat. No. 4,407,956), high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al. (1987) Nature 327: 70-73), use of pollen as vector (WO 85/01856), or use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in which DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens, and a portion is stably integrated into the plant genome (Horsch et al. (1984) Science 223: 496-498; Fraley et al. (1983) Proc. Natl. Acad. Sci. 80: 4803-4807).

The cell can include a nucleic acid of the invention that encodes apolypeptide, wherein the cell expresses a polypeptide of the invention.The cell can also include vector sequences, or the like. Furthermore,cells and transgenic plants that include any polypeptide or nucleic acidabove or throughout this specification, e.g., produced by transductionof a vector of the invention, are an additional feature of theinvention.

For long-term, high-yield production of recombinant proteins, stableexpression can be used. Host cells transformed with a nucleotidesequence encoding a polypeptide of the invention are optionally culturedunder conditions suitable for the expression and recovery of the encodedprotein from cell culture. The protein or fragment thereof produced by arecombinant cell may be secreted, membrane-bound, or containedintracellularly, depending on the sequence and/or the vector used. Aswill be understood by those of skill in the art, expression vectorscontaining polynucleotides encoding mature proteins of the invention canbe designed with signal sequences which direct secretion of the maturepolypeptides through a prokaryotic or eukaryotic cell membrane.

Modified Amino Acid Residues

Polypeptides of the invention may contain one or more modified aminoacid residues. The presence of modified amino acids may be advantageousin, for example, increasing polypeptide half-life, reducing polypeptideantigenicity or toxicity, increasing polypeptide storage stability, orthe like. Amino acid residue(s) are modified, for example,co-translationally or post-translationally during recombinant productionor modified by synthetic or chemical means.

Non-limiting examples of a modified amino acid residue include incorporation or other use of acetylated amino acids, glycosylated amino acids, sulfated amino acids, prenylated (e.g., farnesylated, geranylgeranylated) amino acids, PEG modified (e.g., “PEGylated”) amino acids, biotinylated amino acids, carboxylated amino acids, phosphorylated amino acids, etc. References adequate to guide one of skill in the modification of amino acid residues are abundant throughout the literature.

The modified amino acid residues may prevent or increase affinity of the polypeptide for another molecule, including, but not limited to, polynucleotides, proteins, carbohydrates, lipids and lipid derivatives, and other organic or synthetic compounds.

Identification of Additional Factors

A transcription factor provided by the present invention can also be used to identify additional endogenous or exogenous molecules that can affect a phenotype or trait of interest. On the one hand, such molecules include organic (small or large molecules) and/or inorganic compounds that affect expression of (i.e., regulate) a particular transcription factor. Alternatively, such molecules include endogenous molecules that are acted upon at a transcriptional level by a transcription factor of the invention to modify a phenotype as desired. For example, the transcription factors can be employed to identify one or more downstream genes that are subject to a regulatory effect of the transcription factor. In one approach, a transcription factor or transcription factor homolog of the invention is expressed in a host cell, e.g., a transgenic plant cell, tissue or explant, and expression products, either RNA or protein, of likely or random targets are monitored, e.g., by hybridization to a microarray of nucleic acid probes corresponding to genes expressed in a tissue or cell type of interest, by two-dimensional gel electrophoresis of protein products, or by any other method known in the art for assessing expression of gene products at the level of RNA or protein. Alternatively, a transcription factor of the invention can be used to identify promoter sequences (such as binding sites on DNA sequences) involved in the regulation of a downstream target. After identifying a promoter sequence, interactions between the transcription factor and the promoter sequence can be modified by changing specific nucleotides in the promoter sequence or specific amino acids in the transcription factor that interact with the promoter sequence to alter a plant trait. Typically, transcription factor DNA-binding sites are identified by gel shift assays. After identifying the promoter regions, the promoter region sequences can be employed in double-stranded DNA arrays to identify molecules that affect the interactions of the transcription factors with their promoters (Bulyk et al. (1999) Nature Biotechnol. 17: 573-577).

The identified transcription factors are also useful to identify proteins that modify the activity of the transcription factor. Such modification can occur by covalent modification, such as by phosphorylation, or by protein-protein (homo- or heteropolymer) interactions. Any method suitable for detecting protein-protein interactions can be employed. Among the methods that can be employed are co-immunoprecipitation, cross-linking and co-purification through gradients or chromatographic columns, and the two-hybrid yeast system.

The two-hybrid system detects protein interactions in vivo and is described in Chien et al. (1991) Proc. Natl. Acad. Sci. 88: 9578-9582, and is commercially available from Clontech (Palo Alto, Calif.). In such a system, plasmids are constructed that encode two hybrid proteins: one consists of the DNA-binding domain of a transcription activator protein fused to the TF polypeptide and the other consists of the transcription activator protein's activation domain fused to an unknown protein that is encoded by a cDNA that has been recombined into the plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's binding site. Neither hybrid protein alone can activate transcription of the reporter gene. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in expression of the reporter gene, which is detected by an assay for the reporter gene product. Then, the library plasmids responsible for reporter gene expression are isolated and sequenced to identify the proteins encoded by the library plasmids. After identifying proteins that interact with the transcription factors, assays for compounds that interfere with the TF protein-protein interactions can be performed.

Identification of Modulators

In addition to the intracellular molecules described above, extracellular molecules that alter activity or expression of a transcription factor, either directly or indirectly, can be identified. For example, the methods can entail first placing a candidate molecule in contact with a plant or plant cell. The molecule can be introduced by topical administration, such as spraying or soaking of a plant, or incubating a plant in a solution containing the molecule, and then the molecule's effect on the expression or activity of the TF polypeptide or the expression of the polynucleotide is monitored. Changes in the expression of the TF polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel electrophoresis or the like. Changes in the expression of the corresponding polynucleotide sequence can be detected by use of microarrays, Northerns, quantitative PCR, or any other technique for monitoring changes in mRNA expression. These techniques are exemplified in Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons (1998, and supplements through 2001). Changes in the activity of the transcription factor can be monitored, directly or indirectly, by assaying the function of the transcription factor, for example, by measuring the expression of promoters known to be controlled by the transcription factor (using promoter-reporter constructs), measuring the levels of transcripts using microarrays, Northern blots, quantitative PCR, etc. Such changes in the expression levels can be correlated with modified plant traits and thus identified molecules can be useful for soaking or spraying on fruit, vegetable and grain crops to modify traits in plants. Essentially any available composition can be tested for modulatory activity of expression or activity of any nucleic acid or polypeptide herein. Thus, available libraries of compounds such as chemicals, polypeptides, nucleic acids and the like can be tested for modulatory activity. Often, potential modulator compounds can be dissolved in aqueous or organic (e.g., DMSO-based) solutions for easy delivery to the cell or plant of interest in which the activity of the modulator is to be tested. Optionally, the assays are designed to screen large modulator composition libraries by automating the assay steps and providing compounds from any convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on microplates in robotic assays).

In one embodiment, high throughput screening methods involve providing a combinatorial library containing a large number of potential compounds (potential modulator compounds). Such “combinatorial chemical libraries” are then screened in one or more assays, as described herein, to identify those library members (particular chemical species or subclasses) that display a desired characteristic activity. The compounds thus identified can serve as target compounds.

A combinatorial chemical library can be, e.g., a collection of diversechemical compounds generated by chemical synthesis or biologicalsynthesis. For example, a combinatorial chemical library such as apolypeptide library is formed by combining a set of chemical buildingblocks (e.g., in one example, amino acids) in every possible way for agiven compound length (i.e., the number of amino acids in a polypeptidecompound of a set length). Exemplary libraries include peptidelibraries, nucleic acid libraries, antibody libraries (see, e.g., Vaughnet al. (1996) Nature Biotechnol. 14: 309-314 and PCT/US96/10287),carbohydrate libraries (see, e.g., Liang et al. Science (1996) 274:1520-1522 and U.S. Pat. No. 5,593,853), peptide nucleic acid libraries(see, e.g., U.S. Pat. No. 5,539,083), and small organic moleculelibraries (see, e.g., benzodiazepines, in Baum Chem. & Engineering NewsJan. 18, 1993, page 33; isoprenoids, U.S. Pat. No. 5,569,588;thiazolidinones and metathiazanones, U.S. Pat. No. 5,549,974;pyrrolidines, U.S. Pat. Nos. 5,525,735 and 5,519,134; morpholinocompounds, U.S. Pat. No. 5,506,337) and the like.
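The phrase "in every possible way for a given compound length" corresponds to a Cartesian product over the building blocks. The sketch below enumerates such a polypeptide library in silico; the function names and the choice of the 20 standard amino acids as the default building-block set are assumptions made for illustration.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def peptide_library(length, building_blocks=AMINO_ACIDS):
    """Lazily yield every peptide of the given length."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


def library_size(length, n_blocks=len(AMINO_ACIDS)):
    """Number of members: n_blocks raised to the power of the length."""
    return n_blocks ** length
```

A dipeptide library has 400 members and a pentapeptide library 3,200,000, which is why the generator yields members lazily instead of building a list.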

Preparation and screening of combinatorial or other libraries is wellknown to those of skill in the art. Such combinatorial chemicallibraries include, but are not limited to, peptide libraries (see, e.g.,U.S. Pat. No. 5,010,175; Furka, (1991) Int. J. Pept. Prot. Res. 37:487-493; and Houghton et al. (1991) Nature 354: 84-88). Otherchemistries for generating chemical diversity libraries can also beused.

In addition, as noted, compound screening equipment for high-throughputscreening is generally available, e.g., using any of a number of wellknown robotic systems that have also been developed for solution phasechemistries useful in assay systems. These systems include automatedworkstations including an automated synthesis apparatus and roboticsystems utilizing robotic arms. Any of the above devices are suitablefor use with the present invention, e.g., for high-throughput screeningof potential modulators. The nature and implementation of modificationsto these devices (if any) so that they can operate as discussed hereinwill be apparent to persons skilled in the relevant art.

Indeed, entire high-throughput screening systems are commerciallyavailable. These systems typically automate entire procedures includingall sample and reagent pipetting, liquid dispensing, timed incubations,and final readings of the microplate in detector(s) appropriate for theassay. These configurable systems provide high throughput and rapidstart up as well as a high degree of flexibility and customization.Similarly, microfluidic implementations of screening are alsocommercially available.

The manufacturers of such systems provide detailed protocols for the various high-throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing screening systems for detecting the modulation of gene transcription, ligand binding, and the like. The integrated systems herein, in addition to providing for sequence alignment and, optionally, synthesis of relevant nucleic acids, can include such screening apparatus to identify modulators that have an effect on one or more polynucleotides or polypeptides according to the present invention.

In some assays it is desirable to have positive controls to ensure that the components of the assays are working properly. At least two types of positive controls are appropriate. That is, known transcriptional activators or inhibitors can be incubated with cells or plants, for example, in one sample of the assay, and the resulting increase or decrease in transcription can be detected by measuring the resulting change in RNA levels and/or protein expression, for example, according to the methods herein. It will be appreciated that modulators can also be combined with transcriptional activators or inhibitors to find modulators that inhibit transcriptional activation or transcriptional repression. Either expression of the nucleic acids and proteins herein or any additional nucleic acids or proteins activated by the nucleic acids or proteins herein, or both, can be monitored.

In an embodiment, the invention provides a method for identifying compositions that modulate the activity or expression of a polynucleotide or polypeptide of the invention. For example, a test compound, whether a small or large molecule, is placed in contact with a cell, plant (or plant tissue or explant), or composition comprising the polynucleotide or polypeptide of interest, and a resulting effect on the cell, plant (or tissue or explant), or composition is evaluated by monitoring, either directly or indirectly, one or more of: the expression level of the polynucleotide or polypeptide, and the activity (or modulation of the activity) of the polynucleotide or polypeptide. In some cases, an alteration in a plant phenotype can be detected following contact of a plant (or plant cell, or tissue or explant) with the putative modulator, e.g., by modulation of expression or activity of a polynucleotide or polypeptide of the invention. Modulation of expression or activity of a polynucleotide or polypeptide of the invention may also be caused by molecular elements in a signal transduction second messenger pathway, and such modulation can affect similar elements in the same or another signal transduction second messenger pathway.

Subsequences

Also contemplated are uses of polynucleotides, also referred to herein as oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at least 20, 30, or 50 bases, which hybridize under at least highly stringent (or ultra-high stringent or ultra-ultra-high stringent) conditions to a polynucleotide sequence described above. The polynucleotides may be used as probes, primers, sense and antisense agents, and the like, according to methods as noted supra.

Subsequences of the polynucleotides of the invention, including polynucleotide fragments and oligonucleotides, are useful as nucleic acid probes and primers. An oligonucleotide suitable for use as a probe or primer is at least about 15 nucleotides in length, more often at least about 18 nucleotides, often at least about 21 nucleotides, frequently at least about 30 nucleotides, or about 40 nucleotides, or more in length. A nucleic acid probe is useful in hybridization protocols, e.g., to identify additional polypeptide homologs of the invention, including protocols for microarray experiments. Primers can be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods. See Sambrook, supra, and Ausubel, supra.
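The annealing of a primer to a complementary target strand can be sketched computationally as follows. This is a simplified illustration only: it assumes perfect Watson-Crick complementarity over the whole primer, whereas actual hybridization depends on stringency conditions and tolerates mismatches. The function names `reverse_complement` and `find_annealing_sites` and the example sequences are hypothetical; the default minimum length of 15 nucleotides follows the text above.

```python
# Translation table for the Watson-Crick complement of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (written 5' to 3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_annealing_sites(primer, target_strand, min_length=15):
    """Return 0-based positions on target_strand where the primer could anneal.

    Assumption: annealing is modeled as an exact match between the target
    strand and the reverse complement of the primer.
    """
    if len(primer) < min_length:
        raise ValueError(f"primer shorter than {min_length} nucleotides")
    site = reverse_complement(primer)
    target = target_strand.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    primer = "ATGGCTAGCTAGGATCCA"  # hypothetical 18-nucleotide primer
    target = "GGG" + reverse_complement(primer) + "TTT"
    print(find_annealing_sites(primer, target))
```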

In addition, the invention includes an isolated or recombinant polypeptide including a subsequence of at least about 15 contiguous amino acids encoded by the recombinant or isolated polynucleotides of the invention. For example, such polypeptides, or domains or fragments thereof, can be used as immunogens, e.g., to produce antibodies specific for the polypeptide sequence, or as probes for detecting a sequence of interest. A subsequence can range in size from about 15 amino acids in length up to and including the full length of the polypeptide.

To be encompassed by the present invention, an expressed polypeptide which comprises such a polypeptide subsequence performs at least one biological function of the intact polypeptide in substantially the same manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide fragment can comprise a recognizable structural motif or functional domain such as a DNA binding domain that activates transcription, e.g., by binding to a specific DNA promoter region, an activation domain, or a domain for protein-protein interactions.

Production of Transgenic Plants

Modification of Traits

The polynucleotides of the invention are favorably employed to produce transgenic plants with various traits, or characteristics, that have been modified in a desirable manner, e.g., to improve the seed characteristics of a plant. For example, alteration of expression levels or patterns (e.g., spatial or temporal expression patterns) of one or more of the transcription factors (or transcription factor homologs) of the invention, as compared with the levels of the same protein found in a wild-type plant, can be used to modify a plant's traits. An illustrative example of trait modification, improved characteristics, by altering expression levels of a particular transcription factor is described further in the Examples and the Sequence Listing.

Arabidopsis as a Model System

Arabidopsis thaliana is the object of rapidly growing attention as a model for genetics and metabolism in plants. Arabidopsis has a small genome, and well-documented studies are available. It is easy to grow in large numbers, and mutants defining important genetically controlled mechanisms are either available or can readily be obtained. Various methods to introduce and express isolated homologous genes are available (see Koncz et al., eds., Methods in Arabidopsis Research (1992) World Scientific, New Jersey, N.J., in “Preface”). Because of its small size, short life cycle, obligate autogamy and high fertility, Arabidopsis is also a choice organism for the isolation of mutants and studies in morphogenetic and developmental pathways, and control of these pathways by transcription factors (Koncz supra, p. 72). A number of studies introducing transcription factors into A. thaliana have demonstrated the utility of this plant for understanding the mechanisms of gene regulation and trait alteration in plants (see, for example, Koncz supra, and U.S. Pat. No. 6,417,428).

Arabidopsis Genes in Transgenic Plants.

Expression of genes that encode transcription factors that modify expression of endogenous genes, polynucleotides, and proteins is well known in the art. In addition, transgenic plants comprising isolated polynucleotides encoding transcription factors may also modify expression of endogenous genes, polynucleotides, and proteins. Examples include Peng et al. (1997) Genes and Development 11: 3194-3205, and Peng et al. (1999) Nature 400: 256-261. In addition, many others have demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant species elicits the same or very similar phenotypic response. See, for example, Fu et al. (2001) Plant Cell 13: 1791-1802; Nandi et al. (2000) Curr. Biol. 10: 215-218; Coupland (1995) Nature 377: 482-483; and Weigel and Nilsson (1995) Nature 377: 482-500.

Homologous Genes Introduced into Transgenic Plants.

Homologous genes that may be derived from any plant, or from any source whether natural, synthetic, semi-synthetic or recombinant, and that share significant sequence identity or similarity to those provided by the present invention, may be introduced into plants, for example, crop plants, to confer desirable or improved traits. Consequently, transgenic plants may be produced that comprise a recombinant expression vector or cassette with a promoter operably linked to one or more sequences homologous to presently disclosed sequences. The promoter may be, for example, a plant or viral promoter.

The invention thus provides for methods for preparing transgenic plants, and for modifying plant traits. These methods include introducing into a plant a recombinant expression vector or cassette comprising a functional promoter operably linked to one or more sequences homologous to presently disclosed sequences. Plants and kits for producing these plants that result from the application of these methods are also encompassed by the present invention.

Transcription Factors of Interest for the Modification of Plant Traits

Currently, the existence of a series of maturity groups for different latitudes represents a major barrier to the introduction of new valuable traits. Any trait (e.g., disease resistance) has to be bred into each of the different maturity groups separately, a laborious and costly exercise. The availability of a single strain, which could be grown at any latitude, would therefore greatly increase the potential for introducing new traits to crop species such as soybean and cotton.

For many of the specific effects, traits and utilities listed in Table 7 and Table 9 that may be conferred to plants, one or more transcription factor genes may be used to increase or decrease, advance or delay, or improve or prove deleterious to a given trait. Overexpressing or suppressing one or more genes can impart significant differences in production of plant products, such as different fatty acid ratios. For example, overexpression of G720 caused a plant to become more freezing tolerant, but knocking out the same transcription factor imparted greater susceptibility to freezing. Thus, suppressing a gene that causes a plant to be more sensitive to cold may improve a plant's tolerance of cold.

More than one transcription factor gene may be introduced into a plant, either by transforming the plant with one or more vectors comprising two or more transcription factors, or by selective breeding of plants to yield hybrid crosses that comprise more than one introduced transcription factor. Transgenic plants may be crossed with another plant or selfed to produce seed, which may be used to generate progeny plants having increased tolerance to abiotic stress (“selfing” refers to self-pollinating, or using pollen from one plant to fertilize the same plant or another plant in the same line, whereas “crossing” generally refers to cross pollination with a plant from a different line, such as a non-transformed or wild-type plant, or another transformed plant from a different transgenic line of plants). Crossing provides the advantage of being able to produce new varieties. The resulting seed may then be used to grow a progeny plant that is transgenic and has increased tolerance to abiotic stress. Generally, the progeny plants will express mRNA that encodes a DNA-binding protein having a conserved domain (e.g., an AP2, MYB-related or CCAAT-box binding domain) that binds to a DNA molecule, regulates its expression, and induces the expression of genes and polypeptides that confer to the plant the desirable trait (e.g., abiotic stress tolerance). In these progeny plants, the mRNA may be expressed at a level greater than in a non-transformed plant that does not overexpress the DNA-binding protein.

A listing of specific effects and utilities that the presently disclosed transcription factor genes have on plants, as determined by direct observation and assay analysis, is provided in Table 7. Table 7 shows the polynucleotides identified by SEQ ID NO; GID; and whether the polynucleotide was tested in a transgenic assay. The first column shows the polynucleotide SEQ ID NO; the second column shows the GID; the third column shows whether the gene was overexpressed (OE) or knocked out (KO) in plant studies; the fourth column shows the trait(s) resulting from the knock out or overexpression of the polynucleotide in the transgenic plant; and the fifth column (“Observations”) includes specific experimental observations made when expression of the polynucleotide of the first column was altered.
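For readers who wish to work with Table 7 programmatically, the five-column layout described above can be modeled as a simple record type. The following Python sketch is illustrative only: the class name `Table7Row`, the field names, and the helper `rows_for` are hypothetical, and the three sample rows reflect one reading of the first trait entry for G8, G22 and G156 in Table 7.

```python
from dataclasses import dataclass

# The third column of Table 7 takes one of two values.
VALID_ALTERATIONS = {"OE", "KO"}  # overexpressed / knocked out


@dataclass(frozen=True)
class Table7Row:
    seq_id_no: int      # column 1: polynucleotide SEQ ID NO
    gid: str            # column 2: gene identifier (GID)
    alteration: str     # column 3: "OE" or "KO"
    trait: str          # column 4: trait resulting from the alteration
    observations: str   # column 5: specific experimental observations

    def __post_init__(self):
        if self.alteration not in VALID_ALTERATIONS:
            raise ValueError(f"alteration must be OE or KO, got {self.alteration!r}")


# Sample rows, transcribed under the assumption stated above.
ROWS = [
    Table7Row(1, "G8", "OE", "Altered flowering time", "Late flowering"),
    Table7Row(5, "G22", "OE", "Increased tolerance to abiotic stress",
              "Increased tolerance to high salt"),
    Table7Row(13, "G156", "KO", "Altered seed", "Seed color alteration"),
]


def rows_for(rows, alteration):
    """Return the rows whose gene was overexpressed ("OE") or knocked out ("KO")."""
    return [row for row in rows if row.alteration == alteration]


if __name__ == "__main__":
    for row in rows_for(ROWS, "OE"):
        print(row.gid, "->", row.trait)
```

A gene with several trait entries, or with both OE and KO results (as for G720), would be represented by several such rows sharing one GID.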

TABLE 7 Traits categories and effects of transcription factor genes thatare overexpressed (OE) or knocked out (KO) Polynucleotide OE/ SEQ ID NO:GID. KO Trait Alterations Observations 1 G8 OE Altered flowering timeLate flowering Growth regulation; nutrient Altered C/N sensing uptake 3G19 OE Increased tolerance to disease Increased tolerance to Erysiphe;repressed by methyl jasmonate and induced by 1-aminocyclopropane 1-carboxylic acid (ACC) 5 G22 OE Increased tolerance to abiotic Increasedtolerance to high salt stress 7 G24 OE Altered necrosis Reduced size andnecrotic patches Growth regulation; nutrient Altered C/N sensing uptake2217 G27 OE Growth regulation; nutrient Altered C/N sensing uptake 9 G28OE Increased tolerance to disease Increased tolerance to BotrytisIncreased tolerance to Sclerotinia Increased resistance to Erysiphe 11G47 OE Altered stem morphology Altered structure of vascular tissuesIncreased tolerance to abiotic Better root growth under osmotic stressstress Late flowering Altered flowering time Altered architecture andinflorescence Altered architecture development Increased tolerance toabiotic Reduced apical dominance and osmotic stress Increased toleranceto drought 13 G156 KO Altered seed Seed color alteration Growthregulation; nutrient Altered C/N sensing uptake 15 G157 OE Alteredflowering time Altered flowering time (modest level of overexpressiontriggers early flowering, whereas a larger increase delays flowering) 17G162 OE Altered seed oil Increased seed oil content Altered seed proteinIncreased seed protein content 19 G175 OE Increased tolerance to abioticIncreased tolerance to osmotic stress stress Increased tolerance todrought 21 G180 OE Altered seed oil Decreased seed oil Altered floweringtime Early flowering 23 G183 OE Altered flowering time Early floweringAltered light response and/or Constitutive photomorphogenesis shadetolerance Altered C/N sensing Growth regulation; nutrient uptake 25 G188KO Increased susceptibility to Increased 
susceptibility to Fusariumdisease Better germination under osmotic stress Increased tolerance toabiotic Increased tolerance to drought stress 27 G189 OE Altered sizeIncreased leaf size Growth regulation; nutrient Altered C/N sensinguptake 29 G192 OE Altered flowering time Late flowering Altered seed oilDecreased seed oil content 31 G196 OE Increased tolerance to abioticIncreased tolerance to high salt stress 33 G211 OE Altered leafbiochemistry Increase in leaf xylose Altered architecture Reduced apicaldominance Altered leaf Altered leaf shape 35 G214 OE Altered floweringtime Late flowering Altered leaf biochemistry Increased leaf fatty acidsAltered seed prenyl lipids Increased seed lutein Altered leaf prenyllipids Increased leaf chlorophyll and carotenoids 37 G226 OE Alteredseed protein Increased seed protein Altered trichomes Glabrous, lack oftrichomes Altered root Increased root hairs Increased tolerance toabiotic Increased tolerance to high salt stress Increased tolerance tonitrogen-limited Growth regulation; nutrient medium uptake Altered C/Nsensing 2267 G234 OE Growth regulation; nutrient Altered C/N sensinguptake 2269 G237 OE Growth regulation; nutrient Altered C/N sensinguptake 39 G241 KO Altered seed protein Increased seed protein contentAltered seed oil Decreased seed oil Altered sugar sensing Decreasedgermination and growth on glucose medium 41 G248 OE Increasedsusceptibility to Increased susceptibility to Botrytis disease 43 G254OE Altered sugar sensing Decreased germination and growth on glucosemedium 45 G256 OE Increased tolerance to abiotic Better germination andgrowth in cold stress 47 G278 OE Increased susceptibility to Increasedsusceptibility to Sclerotinia disease 49 G291 OE Altered seed oilIncreased seed oil content 51 G303 OE Increased tolerance to abioticBetter germination on high sucrose and and osmotic stress high NaClIncreased tolerance to drought 53 G312 OE Increased tolerance to abioticBetter germination on high NaCl stress 55 G325 OE 
Increased tolerance toabiotic Better germination on high sucrose and and osmotic stress NaClIncreased tolerance to drought 57 G343 OE Altered glyphosate sensitivityIncreased resistance to glyphosate Altered size Small plant G2295 G347OE Growth regulation; nutrient Altered C/N sensing uptake 59 G353 OEIncreased tolerance to abiotic Increased seedling vigor on and osmoticstress polyethylene glycol (PEG) Altered size Increased tolerance todrought Altered leaf Reduced size Altered flower Altered leafdevelopment Short pedicels, downward pointing siliques 61 G354 OEAltered size Reduced size Altered light response and/or Constitutivephotomorphogenesis shade tolerance Short pedicels, downward pointingFlower siliques 63 G361 OE Altered flowering time Late flowering 65 G362OE Altered flowering time Late flowering Altered size Reduced sizeAltered trichomes Ectopic trichome formation, increased Morphology:other trichome number Increased pigmentation in seed and embryos, and inother organs 67 G371 OE Increased susceptibility to Increasedsusceptibility to Botrytis disease 69 G390 OE Altered architectureAltered shoot development 71 G391 OE Altered architecture Altered shootdevelopment 73 G409 OE Increased tolerance to disease Increasedtolerance to Erysiphe 75 G427 OE Altered seed oil Increased oil contentAltered seed protein Decreased protein content Growth regulation;nutrient Altered C/N sensing uptake 77 G438 KO Altered stem morphologyReduced lignin Altered architecture Reduced branching 79 G450 OE Alteredseed Increased seed size 81 G464 OE Increased tolerance to abioticBetter germination and growth in heat stress 83 G470 OE Alteredfertility Short stamen filaments 85 G477 OE Increased susceptibility toIncreased susceptibility to Sclerotinia disease Increased sensitivity tooxidative stress Increased tolerance to abiotic stress 87 G481 OEIncreased tolerance to abiotic Altered sugar sensing: better and osmoticstress germination on sucrose media Increased tolerance to drought 
89G482 OE Increased tolerance to abiotic Increased tolerance to high saltand osmotic stress 91 G484 KO Altered seed glucosinolates Alteredglucosinolate profile 93 G489 OE Increased tolerance to abioticIncreased tolerance to osmotic stress and osmotic stress Increasedtolerance to drought 95 G490 OE Altered flowering time Early flowering97 G504 OE Altered seed oil composition Decreased seed oil compositionand content; increase in 18:2 fatty acid and decrease in 20:1 fatty acid99 G509 KO Altered seed oil Increased total seed oil and protein Alteredseed protein content 101 G519 OE Altered seed oil Increased seed oilcontent 103 G545 OE Increased tolerance to abiotic Susceptible to highsalt and osmotic stress Increased susceptibility to Erysiphe Increasedsusceptibility to Increased susceptibility to disease Pseudomonas Growthregulation; nutrient Increased susceptibility to Fusarium uptakeIncreased tolerance to phosphate-free medium Altered C/N sensing 105G546 OE Altered hormone sensitivity Decreased sensitivity to abscisicacid (ABA) 107 G561 OE Altered seed oil Increased seed oil contentTolerance to abiotic stress Increased tolerance to potassium-free medium109 G562 OE Altered flowering time Late flowering 111 G567 OE Alteredseed oil Increased total seed oil/protein content Altered seed proteinIncreased total seed oil/protein content Altered sugar sensing Decreasedseedling vigor on high glucose 113 G568 OE Altered architecture Alteredbranching 115 G584 OE Altered seed Large seeds 117 G585 OE Alteredtrichomes Reduced trichome density 119 G590 KO Altered seed oilIncreased seed oil content OE Altered flowering time Early flowering OEGrowth regulation; nutrient Altered C/N sensing uptake 121 G594 OEIncreased susceptibility to Increased susceptibility to Sclerotiniadisease 123 G597 OE Altered seed protein Altered seed protein content125 G598 OE Altered seed oil Increased seed oil 127 G634 OE Alteredtrichomes Increased trichome density and size Altered light responseand/or 
Increased shade tolerance; lack of shade shade toleranceavoidance phenotype Increased tolerance to abiotic Increased droughttolerance stress 129 G635 OE Variegation Altered coloration Growthregulation; nutrient Altered C/N sensing uptake 131 G636 OE Alteredsenescence Premature senescence 133 G638 OE Altered flower Alteredflower development 135 G652 KO Altered seed prenyl lipids Increase inalpha-tocopherol 2391 G657 OE Growth regulation; nutrient Altered C/Nsensing uptake 137 G663 OE Altered pigment Increased anthocyanins inleaf, root, seed 139 G664 OE Increased tolerance to abiotic Bettergermination and growth in cold stress 141 G674 OE Altered leaf Darkgreen, upwardly oriented leaves 143 G676 OE Altered trichomes Reducedtrichome number, ectopic trichome formation 145 G680 OE Altered sugarsensing Reduced germination on glucose medium 147 G682 OE Alteredtrichomes Glabrous, lack of trichomes Increased tolerance to abioticBetter germination and growth in heat and osmotic stress Increased roothairs Altered root Increased tolerance to drought Growth regulation;nutrient Altered C/N sensing uptake 149 G715 OE Altered seed oilIncreased seed oil content 151 G720 OE Increased tolerance to abioticMore freezing tolerant KO and osmotic stress Increased susceptibility tofreezing Increased susceptibility to abiotic and osmotic stress 153 G736OE Altered flowering time Late flowering Altered leaf Altered leaf shape155 G748 OE Altered seed prenyl lipids Increased lutein content Alteredstem morphology More vascular bundles in stem Altered flowering timeLate flowering 2413 G760 OE Growth regulation; nutrient Altered C/Nsensing uptake 157 G779 OE Altered fertility Reduced fertility Alteredflower Homeotic transformations 159 G789 OE Altered flowering time Earlyflowering 161 G801 OE Increased tolerance to abiotic Better germinationon high NaCl stress 163 G849 KO Altered seed oil Increased seed oilcontent Altered seed protein Altered seed protein content 165 G859 OEAltered flowering time 
Late flowering 167 G864 OE Increased tolerance toabiotic Better germination in heat stress Increased tolerance to drought169 G867 OE Increased tolerance to abiotic Better seedling vigor on highsalt and osmotic stress Better seedling vigor on high sucrose Alteredsugar sensing 171 G869 OE Altered seed oil composition Altered seedfatty acid composition 2437 G872 OE Growth regulation; nutrient AlteredC/N sensing uptake 173 G877 KO Embryo lethal Embryo lethal phenotype:potential herbicide target 175 G881 OE Increased susceptibility toIncreased susceptibility to Erysiphe disease 177 G892 KO Altered seedprotein Altered seed protein content Altered seed oil Altered seed oilcontent 179 G896 KO Increased susceptibility to Increased susceptibilityto Fusarium disease 2451 G904 OE Growth regulation; nutrient Altered C/Nsensing uptake 181 G910 OE Altered flowering time Late flowering 183G911 OE Tolerance to abiotic stress Increased growth on potassium-freemedium 185 G912 OE Increased tolerance to abiotic Freezing tolerantstress Increased survival in drought Altered pigment conditions Alteredsugar sensing Dark green color Reduced cotyledon expansion in glucose187 G913 OE Increased tolerance to abiotic Increased tolerance tofreezing stress Late flowering Altered flowering time Increasedtolerance to drought 189 G922 OE Increased tolerance to abiotic Bettergermination on high sucrose and osmotic stress Better germination,increased root growth on high salt Increased tolerance to drought 191G926 KO Altered hormone sensitivity Reduced sensitivity to ABA Increasedtolerance to abiotic Increased tolerance to high salt and and osmoticstress sucrose 2065 G932 OE Growth regulation; nutrient C/N sensing:increased tolerance to low uptake nitrogen 2461 G937 OE Growthregulation; nutrient Altered C/N sensing uptake 2471 G958 OE Growthregulation; nutrient Altered C/N sensing uptake 193 G961 KO Altered seedoil Increased seed oil content 2479 G964 KO Growth regulation; nutrientAltered C/N sensing 
uptake 195 G971 OE Altered flowering time Lateflowering 197 G974 OE Altered seed oil Altered seed oil content 199 G975OE Altered leaf biochemistry Increased fatty acids and wax in leavesIncreased tolerance to abiotic Increased tolerance to drought andosmotic stress Altered C/N sensing Growth regulation; nutrient uptake201 G979 KO Altered seed Altered seed development, ripening, Growthregulation; nutrient and germination uptake Altered C/N sensing 203 G987KO Altered leaf fatty acids Reduction in 16:3 fatty acids Altered leafbiochemistry Altered prenyl lipids: chlorophyll, tocopherol, carotenoid205 G988 OE Altered seed protein Increased seed protein content Alteredflower Enlarged floral organs, short pedicels Altered architectureReduced lateral branching Altered stem morphology Thicker stem, altereddistribution of Growth regulation; nutrient vascular bundles uptakeAltered C/N sensing 207 G1040 OE Altered seed Smaller and more roundedseeds 209 G1047 OE Increased tolerance to disease Increased tolerance toFusarium 2515 G1048 OE Altered light response and/or Increased shadetolerance; lack of shade shade tolerance avoidance phenotype 2517 G1049OE Growth regulation; nutrient Altered C/N sensing uptake 211 G1051 OEAltered flowering time Late flowering 213 G1052 OE Altered floweringtime Late flowering 215 G1062 KO Altered seed Altered seed shape 217G1063 OE Altered leaf Altered leaf shape, dark green color Alteredinflorescence Altered inflorescence development Altered flower Alteredflower development, ectopic carpel tissue 219 G1064 OE Increasedsusceptibility to Increased sensitivity to Botrytis disease 221 G1069 OEAltered hormone sensitivity Reduced ABA sensitivity Increased toleranceto abiotic Better germination under osmotic stress and osmotic stressIncreased tolerance to drought Growth regulation; nutrient Altered C/Nsensing uptake 223 G1073 OE Altered size Substantially increased plantsize Altered seed Increased seed yield Increased tolerance to abioticIncreased 
tolerance to drought stress 225 G1075 OE Altered flowerReduced or absent petals, sepals and stamens 227 G1084 OE Increasedsusceptibility to Increased susceptibility to Botrytis disease 229 G1089KO Increased tolerance to osmotic Better germination under osmoticstress stress 231 G1134 OE Altered hormone sensitivity Altered responseto ethylene: longer hypocotyls and lack of apical hook 233 G1140 OEAltered flower Altered flower development 235 G1143 OE Altered seed oilAltered seed oil content 237 G1146 OE Altered leaf Altered leafdevelopment 239 G1196 KO Increased susceptibility to Increasedsusceptibility to Botrytis disease 241 G1198 OE Altered seed oilIncreased seed oil content 243 G1225 OE Altered flowering time Earlyflowering Altered sugar sensing Better germination on sucrose andglucose media 245 G1226 OE Altered seed oil Increased seed oil content247 G1229 OE Altered seed oil Decreased seed oil content 2555 G1246 OEGrowth regulation; nutrient Altered C/N sensing uptake 249 G1255 OEIncreased susceptibility to Increased susceptibility to Botrytis diseaseIncreased seed size Altered seed Reduced apical dominance Alteredarchitecture Altered C/N sensing Growth regulation; nutrient uptake 251G1266 OE Increased tolerance to disease Increased tolerance to ErysipheGrowth regulation; nutrient Altered C/N sensing uptake 253 G1275 OEAltered architecture Reduced apical dominance 255 G1305 OE Increasedtolerance to abiotic Reduced chlorosis in heat stress 257 G1322 OEIncreased tolerance to abiotic Increased seedling vigor in cold stressReduced size Altered size Increase in M39480 Leaf glucosinolatesConstitutive photomorphogenesis Altered light response and/or AlteredC/N sensing: increased shade tolerance tolerance to low nitrogen Growthregulation; nutrient uptake 259 G1323 OE Altered seed oil Decreased seedoil Altered seed protein Increased seed protein 261 G1330 OE Alteredhormone sensitivity Ethylene insensitive when germinated in the dark onACC 263 G1331 OE Altered light 
response and/or Constitutivephotomorphogenesis shade tolerance Altered C/N sensing Growthregulation; nutrient uptake 265 G1332 OE Altered trichomes Reducedtrichome density Growth regulation; nutrient Altered C/N sensing uptake267 G1363 OE Increased tolerance to disease Increased tolerance toFusarium 269 G1411 OE Altered architecture Loss of apical dominance 2607G1412 KO Altered light response and/or Increased shade tolerance; lackof shade shade tolerance avoidance phenotype 271 G1417 KO Altered seedoil Increase in 18:2, decrease in 18:3 fatty acids 273 G1419 OE Alteredseed protein Increased seed protein 275 G1449 OE Altered flower Alteredflower structure 277 G1451 OE Altered size Increased plant size OEAltered leaf Large leaf size KO Altered seed oil Altered seed oilcontent 279 G1452 OE Altered trichomes Reduced trichome density Alteredleaf Altered leaf shape, dark green color Altered hormone sensitivityReduced sensitivity to ABA Altered flowering time Better germination onsucrose, salt Increased tolerance to abiotic Late flowering and osmoticstress Increased tolerance to drought 281 G1463 OE Altered senescencePremature senescence 283 G1471 OE Altered seed oil Increased seed oilcontent 285 G1478 OE Altered seed protein Decreased seed protein contentAltered flowering time Late flowering Altered seed oil Increased seedoil content 287 G1482 KO Altered pigment Increased anthocyanins OEAltered root Increased root growth 289 G1488 OE Altered seed proteinAltered seed protein content Altered light response and/or Constitutivephotomorphogenesis shade tolerance Reduced apical dominance, shorterAltered architecture stems 291 G1494 OE Altered flowering time Earlyflowering Altered light response and/or Long hypocotyls, altered leafshape shade tolerance Pale green leaves, altered leaf shape Altered leafAltered C/N sensing Growth regulation; nutrient uptake 293 G1496 OEAltered seed oil Altered seed oil content 295 G1499 OE Altered pigmentDark green color Altered architecture 
Altered plant architecture Alteredflower Altered floral organ identity and development 297 G1519 KO Embryolethal Embryo lethal phenotype: potential herbicide target 299 G1526 KOAltered seed oil Increased seed oil content 301 G1540 OE Altered celldifferentiation Reduced cell differentiation in meristem 303 G1543 OEAltered architecture Altered architecture, compact plant Morphology:other Dark green color Altered seed oil Decreased seed oil Altered leafprenyl lipids Increase in chlorophyll a and b 2667 G1587 OE Growthregulation; nutrient Altered C/N sensing uptake 305 G1634 OE Alteredseed oil Increased seed oil content Altered seed protein Decreased seedprotein content 307 G1637 OE Altered seed protein Altered seed proteincontent 309 G1640 OE Altered seed oil Increased seed oil 311 G1645 OEAltered inflorescence Altered inflorescence structure 313 G1646 OEAltered seed oil Increased seed oil content 2685 G1649 OE Growthregulation; nutrient Altered C/N sensing uptake 315 G1652 OE Alteredseed protein Increased seed protein content G1666 KO Growth regulation;nutrient Altered C/N sensing uptake 317 G1672 OE Altered seed oilAltered seed oil content 319 G1677 OE Altered seed protein Altered seedprotein content Altered seed oil Altered seed oil content 321 G1749 OEAltered necrosis Formation of necrotic lesions 323 G1750 OE Altered seedoil Increased seed oil content Growth regulation; nutrient Altered C/Nsensing uptake 325 G1756 OE Increased susceptibility to Increasedsusceptibility to Botrytis disease 327 G1765 OE Altered seed oilIncreased seed oil content 329 G1777 OE Altered seed oil Increased seedoil content Altered seed protein Decreased seed protein content 331G1792 OE Altered leaf Dark green, shiny leaves Increased tolerance todisease Increased resistance to Erysiphe Increased tolerance to abioticIncreased resistance to Botrytis and osmotic stress Increased resistanceto Fusarium Increased tolerance to nitrogen-limited medium Increasedtolerance to drought 333 G1793 OE 
Altered seed oil Increased seed oilcontent 335 G1794 OE Altered architecture Altered architecture, bushierplant Altered light response and/or Reduced apical dominance shadetolerance Constitutive photomorphogenesis Increased tolerance to osmoticIncreased sensitivity to high PEG and abiotic stress Reduced root growth337 G1804 OE Altered flowering time Late flowering Altered sugar sensingAltered sugar sensing: more sensitive to glucose in germination assays339 G1818 OE Altered seed protein Increased protein content 341 G1820 OEAltered flowering time Early flowering Altered hormone sensitivityReduced ABA sensitivity Altered seed protein Increased seed proteincontent Increased tolerance to abiotic Better germination in high NaCland osmotic stress Increased tolerance to drought 2733 G1835 OE Growthregulation; nutrient Altered C/N sensing uptake 343 G1836 OE Increasedtolerance to abiotic Better germination in high salt and osmotic stressIncreased tolerance to drought 345 G1838 OE Altered seed oil Increasedseed oil content 347 G1841 OE Increased tolerance to abiotic Bettergermination under heat stress and osmotic stress Early flowering Alteredflowering time 349 G1842 OE Altered flowering time Early flowering 351G1843 OE Altered flowering time Early flowering 353 G1852 OE Increasedtolerance to abiotic Better root growth under osmotic stress and osmoticstress 355 G1863 OE Altered leaf Altered leaf shape and coloration 357G1880 KO Increased tolerance to disease Increased resistance to Botrytis359 G1895 OE Altered flowering time Late flowering 361 G1902 OE Alteredseed oil Increased seed oil content 363 G1903 OE Altered seed proteinDecreased seed protein content 365 G1919 OE Increased tolerance todisease Increased tolerance to Botrytis 367 G1927 OE Increased toleranceto disease Increased tolerance to Sclerotinia 369 G1930 OE Increasedtolerance to osmotic Better germination under osmotic stress stressAltered C/N sensing Growth regulation; nutrient uptake 371 G1936 KOIncreased 
susceptibility to Increased susceptibility to Sclerotiniadisease Increased susceptibility to Botrytis 373 G1944 OE Alteredsenescence Early senescence 375 G1946 OE Altered seed oil Increased seedoil content Altered seed protein Decreased seed protein content Alteredflowering time Early flowering Growth regulation; nutrient Increasedroot growth on phosphate- uptake free media 377 G1947 KO Alteredfertility Reduced fertility 379 G1948 OE Altered seed oil Increased seedoil content 381 G1950 OE Increased tolerance to disease Increasedtolerance to Botrytis 383 G1958 KO Altered size Reduced size and rootmass Altered seed oil Increased seed oil content Altered seed proteinIncreased seed protein content. 2157 G1995 OE Altered light responseand/or Increased shade tolerance; lack of shade shade toleranceavoidance phenotype 385 G2007 OE Altered flowering time Late flowering387 G2010 OE Altered flowering time Early flowering 389 G2053 OEIncreased tolerance to abiotic Increased root growth under osmotic andosmotic stress stress Growth regulation; nutrient Increased tolerance todrought uptake Altered C/N sensing 2797 G2057 OE Growth regulation;nutrient Altered C/N sensing uptake 391 G2059 OE Altered seed oilAltered seed oil content Altered seed protein Altered seed proteincontent 393 G2085 OE Altered seed Increased seed size and altered seedcolor 395 G2105 OE Altered seed Large, pale seeds 397 G2110 OE Increasedtolerance to abiotic Increased tolerance to high salt and osmotic stressIncreased tolerance to drought 399 G2114 OE Altered seed Increased seedsize 401 G2117 OE Altered seed protein Increased seed protein contentAltered C/N sensing 403 G2123 OE Altered seed oil Increased seed oilcontent 405 G2130 OE Increased tolerance to abiotic Better germinationin heat stress 2163 G2131 OE Growth regulation; nutrient Altered C/Nsensing uptake 407 G2133 OE Altered herbicide sensitivity Increasedtolerance to glyphosate Altered flowering time Late flowering Increasedtolerance to abiotic 
Increased drought tolerance and osmotic stressAltered C/N sensing Growth regulation; nutrient uptake 409 G2138 OEAltered seed oil Increased seed oil content 411 G2140 OE Altered hormonesensitivity Decreased sensitivity to ABA Increased tolerance to abioticBetter germination on high NaCl and and osmotic stress sucrose Increasedtolerance to drought 413 G2143 OE Altered inflorescence Alteredinflorescence development Altered leaf Altered leaf shape, dark greencolor Altered flower Altered flower development, ectopic carpel tissue415 G2144 OE Altered flowering time Early flowering Altered leaf Palegreen leaves, altered leaf shape Light response Long hypocotyls, alteredleaf shape Growth regulation; nutrient Altered C/N sensing uptake 2827G2145 OE Growth regulation; nutrient Altered C/N sensing uptake 417G2153 OE Increased tolerance to osmotic Better germination under osmoticstress stress 419 G2155 OE Altered flowering time Late flowering 421G2192 OE Altered seed oil Altered seed fatty acid composition 423 G2295OE Altered flowering time Early flowering 425 G2340 OE Altered seedglucosinolates Altered glucosinolate profile 427 G2343 OE Altered seedoil Increased seed oil content 429 G2346 OE Altered size Enlargedseedlings 431 G2347 OE Altered flowering time Early flowering 433 G2379OE Increased tolerance to osmotic Increased seedling vigor on highstress sucrose media 435 G2430 OE Increased tolerance to abioticIncreased tolerance to heat and osmotic stress Increased leaf size,faster development Altered size 437 G2505 OE Increased tolerance toabiotic Increased tolerance to drought and osmotic stress Increasedshade tolerance; lack of shade Altered light response and/or avoidancephenotype shade tolerance 439 G2509 OE Altered seed oil Decreased seedoil content Altered seed protein Increased seed protein content Alteredseed prenyl lipids Increase in alpha-tocopherol Altered architectureReduced apical dominance Altered flowering time Early flowering 2875G2512 OE Growth regulation; 
nutrient Altered C/N sensing uptake 441G2517 OE Altered herbicide sensitivity Increased tolerance to glyphosate443 G2520 OE Altered seed prenyl lipids Altered tocopherol compositionGrowth regulation; nutrient Altered C/N sensing uptake 2185 G2535 OEGrowth regulation; nutrient Altered C/N sensing uptake 445 G2555 OEAltered light response and/or Constitutive photomorphogenesis shadetolerance Increased susceptibility to Botrytis Increased susceptibilityto disease 447 G2557 OE Altered leaf Altered leaf shape, dark greencolor Altered flower Altered flower development, ectopic carpel tissue449 G2583 OE Altered leaf Glossy, shiny leaves 451 G2701 OE Increasedtolerance to abiotic Better germination on high NaCl and and osmoticstress sucrose Increased tolerance to drought 2191 G2718 OE Growthregulation; nutrient Altered C/N sensing uptake 453 G2719 OE Increasedtolerance to osmotic Increased seedling vigor on high stress sucroseGrowth regulation; nutrient Altered C/N sensing uptake 2193 G2776 OEIncreased tolerance to abiotic Increased tolerance to drought andosmotic stress 455 G2789 OE Altered hormone sensitivity Bettergermination on high sucrose Increased tolerance to abiotic Reduced ABAsensitivity and osmotic stress Increased tolerance to drought Alteredlight response and/or Increased shade tolerance; lack of shade shadetolerance avoidance phenotype Growth regulation; nutrient Altered C/Nsensing uptake 457 G2830 KO Altered seed oil Increased seed oil content1951 G12 KO Altered hormone sensitivity Increased sensitivity to ACC OEAltered necrosis Leaf and hypocotyl necrosis 1953 G30 OE Altered leafGlossy green leaves Altered light response and/or Increased shadetolerance; lack of shade shade tolerance avoidance phenotype 1975 G231OE Altered leaf biochemistry Increased leaf unsaturated fatty acidsAltered seed oil Increased seed oil content Altered seed proteinDecreased seed protein content 1979 G247 OE Altered trichomes Alteredtrichome distribution, reduced trichome density 
1991 G370 KO Alteredsize Reduced size, shiny leaves OE Altered trichome ectopic trichomeformation 2009 G485 OE Altered flowering time Early flowering KO Alteredflowering time Late flowering 2061 G839 OE Growth regulation; nutrientIncreased tolerance to nitrogen-limited uptake medium 2099 G1357 OEAltered leaf Altered leaf shape, dark green leaves Increased toleranceto abiotic Increased tolerance to cold and osmotic stress Insensitive toABA Altered hormone sensitivity Late flowering Altered flowering time2126 G1646 OE Altered seed oil Increased seed oil content 2142 G1816 OEAltered sugar sensing Increased tolerance to glucose Growth regulation;nutrient Altered C/N sensing; less anthocyanin uptake onnitrogen-limited medium Increased tolerance to abiotic Increasedtolerance to osmotic stress and osmotic stress Increased root hairsAltered root Glabrous leaves Altered trichomes Increased tolerance tonitrogen-limited medium 2147 G1888 OE Altered size Reduced size, darkgreen leaves 2153 G1945 OE Altered flowering time Late flowering Alteredleaf Altered leaf shape 2195 G2826 OE Altered flower Aerial rosettesAltered trichomes Ectopic trichome formation 2197 G2838 OE Alteredtrichomes Increased trichome density Altered flowering time Lateflowering Altered flower Flower: multiple alterations Leaves Aerialrosettes Altered size Dark green leaves Increased seedling size 2199G2839 OE Increased tolerance to abiotic Better germination on highsucrose and osmotic stress Increased tolerance to drought Alteredinflorescence Downward pedicels Altered size Reduced size

Table 8 shows the polypeptides identified by SEQ ID NO; Gene ID (GID) No.; the transcription factor family to which the polypeptide belongs, and conserved domains of the polypeptide. The first column shows the polypeptide SEQ ID NO; the second column shows the GID No.; the third column shows the transcription factor family to which the polypeptide belongs; and the fourth column shows the amino acid residue positions of the conserved domain in amino acid (AA) coordinates.

TABLE 8 Gene families and conserved domains

Polypeptide   GID                     Conserved Domains in
SEQ ID NO:    No.      Family         Amino Acid Coordinates
4             G19      AP2            76-145
6             G22      AP2            89-157
10            G28      AP2            145-213
12            G47      AP2            11-80
38            G226     MYB-related    34-69
60            G353     Z-C2H2         41-61, 84-104
62            G354     Z-C2H2         42-62, 88-109
88            G481     CAAT           20-110
90            G482     CAAT           26-116
2010          G485     CAAT           20-110
94            G489     CAAT           57-156
128           G634     TH             62-147, 189-245
148           G682     MYB-related    27-63
168           G864     AP2            119-186
170           G867     AP2            59-124
186           G912     AP2            51-118
188           G913     AP2            62-128
190           G922     SCR            225-242
192           G926     CAAT           131-225
200           G975     AP2            4-71
2516          G1048    bZIP           138-190
222           G1069    AT-hook        67-74
224           G1073    AT-hook        33-42, 78-175
226           G1075    AT-hook        78-85
2102          G1364    CAAT           29-119
270           G1411    AP2            87-154
278           G1451    ARF            22-357
304           G1543    HB             135-195
332           G1792    AP2            17-85
2142          G1816    MYB-related    26-61
342           G1820    CAAT           70-133
344           G1836    CAAT           30-164
370           G1930    AP2            59-124
2608          G1995    Z-C2H2         93-113
408           G2133    AP2            11-83
418           G2153    AT-hook        75-94, 162-206
420           G2155    AT-hook        18-38
2172          G2345    CAAT           28-118
440           G2509    AP2            89-156
450           G2583    AP2            4-71
              G2718    MYB-related    28-64
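As an illustration of how the conserved-domain coordinates in Table 8 might be used, the following Python sketch parses a coordinate entry such as "41-61, 84-104" (the G353 row) and slices the corresponding residues out of a polypeptide sequence. It assumes the coordinates are 1-based and inclusive, which is the usual convention for amino acid positions but is not stated explicitly here. The function names and the placeholder sequence are hypothetical; the actual polypeptide sequences are those of the sequence listing and are not reproduced in this section.

```python
def parse_coordinates(text):
    """Parse a Table 8 coordinate entry such as "41-61, 84-104"
    into a list of (start, end) integer pairs."""
    spans = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        spans.append((int(start), int(end)))
    return spans


def extract_domains(sequence, spans):
    """Return the subsequence for each (start, end) pair.

    Assumption: coordinates are 1-based and inclusive, so residue 1
    is sequence[0] and the end residue is included in the slice.
    """
    return [sequence[start - 1:end] for start, end in spans]


# Coordinates copied from the G353 row of Table 8.
g353_spans = parse_coordinates("41-61, 84-104")

# Placeholder sequence for illustration only; it is NOT the G353 polypeptide.
placeholder_sequence = "ACDEFGHIKLMNPQRSTVWY" * 6  # 120 residues

domains = extract_domains(placeholder_sequence, g353_spans)
for (start, end), domain in zip(g353_spans, domains):
    print(f"residues {start}-{end}: {len(domain)} aa")
```

With a real sequence substituted for the placeholder, the extracted domain strings could serve as queries in a homology search, provided the coordinate convention assumed above holds.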

Examples of some of the utilities that may be desirable in plants, and that may be provided by transforming the plants with the presently disclosed sequences, are listed in Table 9. Many of the transcription factors listed in Table 9 may be operably linked with a specific promoter that causes the transcription factor to be expressed in response to environmental, tissue-specific or temporal signals. For example, G362 induces ectopic trichomes on flowers but also produces small plants. The former may be desirable to produce insect or herbivore resistance, or increased cotton yield, but the latter may be undesirable in that it may reduce biomass. However, by operably linking G362 with a flower-specific promoter, one may achieve the desirable benefits of the gene without affecting overall biomass to a significant degree. For examples of flower-specific promoters, see Kaiser et al. (supra). For examples of other tissue-specific, temporal-specific or inducible promoters, see the above discussion under the heading “Vectors, Promoters, and Expression Systems”.

TABLE 9 Genes, traits and utilities that affect plant characteristicsTranscription factor genes Trait Category Phenotype(s) that impacttraits Utility Abiotic stress Effect of chilling on plants Increasedtolerance: G256; G664; G1322 Improved germination, growth rate, earlierplanting, yield Germination in cold Increased tolerance: G256; G664Earlier planting; improved survival, yield Freezing tolerance G720 (G720KO is more Earlier planting; susceptible); G912; G913 improved quality,survival, yield Drought Increased tolerance: G47; G175; G188; G303;Improved survival, G325; G353; G481; G489; vigor, appearance, G634;G682; G864; G912; yield G913; G922; G975; G1069; G1452; G1792; G1820;G1836; G2053; G2110; G2133; G2140; G2505; G2701; G2776; G2789; G2839Heat Increased tolerance: G464; G682; G864; G1305; Improved G1841;G2130; G2430 germination, growth rate, later planting, yield Osmoticstress Increased sensitivity: G1794 Abiotic stress response manipulationIncreased tolerance: G47; G175; G188; G303; Improved germination G325;G353; G489; G922; rate, seedling vigor, G926; G1069; G1089; survival,yield G1452; G1816; G1820; G1852; G1930; G2053; G2140; G2153; G2379;G2701; G2719; G2789; G2839 Salt tolerance More susceptible: G545Manipulation of response to high salt conditions Increased tolerance:G22; G196; G226; G312; Improved germination G482; G801; G867; G922;rate, survival, yield; G1836; G2110 extended growth range Nitrogenstress Sensitivity to N limitation: G1794 Manipulation of response tolow nutrient conditions Tolerance to N limitation: G225; G226; G839;G1792; Improved yield and G1816 nutrient stress tolerance, decreasedfertilizer usage Phosphate stress Tolerance to P limitation: G545; G561;G911; G1946 Improved yield and nutrient stress tolerance, decreasedfertilizer usage Oxidative stress G477 Improved yield, quality,ultraviolet and chemical stress tolerance Herbicide Glyphosate G343;G2133; G2517 Generation of glyphosate-resistant plants to improve weedcontrol Hormone 
Abscisic acid (ABA) sensitivity sensitivity Reducedsensitivity to G546; G926; G1069; G1357; Modification of seed ABA:G1452; G1820; G2140; development, G2789 improved seed dormancy, cold anddehydration tolerance Sensitivity to ethylene Altered response: G1134Manipulation of fruit ripening Insensitive to ethylene: G1330 DiseaseBotrytis Increased susceptibility: G248; G371; G1064; G1084;Manipulation of G1196; G1255; G1756; response to disease G1936; G2555organism Increased resistance or G28; G1792; G1880; G1919; Improvedyield, tolerance: G1950 appearance, survival, extended range FusariumIncreased susceptibility: G188; G545; G896 Manipulation of response todisease organism Increased resistance or G1047; G1792 Improved yield,tolerance: appearance, survival, extended range Erysiphe Increasedsusceptibility: G545; G881 Manipulation of response to disease organismIncreased resistance or G19; G28; G409; G1266; Improved yield,tolerance: G1363; G1792 appearance, survival, extended range PseudomonasIncreased susceptibility: G545 Manipulation of response to diseaseorganism Sclerotinia Increased susceptibility: G278; G477; G594; G1936Manipulation of response to disease organism Increased resistance orG28; G1927 Improved yield, tolerance: appearance, survival, extendedrange Growth Altered sugar sensing regulator Decreased tolerance toG241; G254; G567; G680; Alteration of energy sugars: G912; G1804balance, Increased tolerance to G481; G867; G1225; G1816 photosyntheticrate, sugars: carbohydrate accumulation, biomass production, source-sinkrelationships, senescence; alteration of storage compound accumulationin seeds Altered C/N sensing G682; G226; G1816; G2718; G24; G545; G760;G937; G971; G988; G1069; G1322; G1587; G1666; G2117; G2131; G2520;G2789; G8; G27; G156; G183; G189; G234; G237; G347; G427; G590; G635;G657; G872; G904; G912; G932; G958; G964; G975; G979; G1049; G1246;G1255; G1266; G1331; G1332; G1494; G1649; G1750; G1816; G1835; G1930;G2053; G2057; G2133; G2144; G2145; 
G2295; G2512; G2535; G2719 Floweringtime Early flowering G157; G180; G183; G485 Faster generation (OE);G490; G590; G789; time; synchrony of G1225; G1494; G1820; flowering;additional G1841; G1842; G1843; harvests within a G1946; G2010; G2144;growing season, G2295; G2347; G2509 shortening of breeding programs Lateflowering G8; G47; G157; G192; G214; Increased yield or G231; G361;G362; G485 biomass, alleviate risk (KO); G562; G736; G748; of transgenicpollen G859; G910; G913; G971; escape, synchrony of G1051; G1052; G1357;flowering G1452; G1478; G1804; G1895; G1945; G2007; G2133; G2155; G2838General Altered flower structure development Stamen: G988; G1075; G1140;Ornamental and morphology G1499; G2557 modification of plant Sepal:G1075; G1140; G2557 architecture, improved Petal: G638; G1075; G1140; orreduced fertility to G1449; G1499; G2557 mitigate escape of Pedicel:G353; G354; G988 transgenic pollen, Carpel: G1063; G1140; G2143;improved fruit size, G2143; G2557 shape, number or Multiple alterations:G638; G988; G1063; G1140; yield G1449; G1499; G2143; G2557 Enlargedfloral organs: G988; G1449; G2838 Siliques: G353; G354 G470; G779; G988;G1075; G1140; G1499; G1947; Reduced fertility: G2143; G2557 Aerialrosettes G638; G779; G1140; G1499 G1995; G2826; G2838 Inflorescencearchitectural change Altered branching pattern: G47; G1063; G1645; G2143Ornamental Short internodes/bushy G47 modification of flowerinflorescences: architecture; timing of Internode elongation: G1063flowering; altered Lack of inflorescence: G1499; G2143 plant habit foryield or harvestability benefit; reduction in pollen production ofgenetically modified plants; manipulation of seasonality and annual orperennial habit; manipulation of determinate vs. 
indeterminate growthAltered shoot meristem development Stem bifurcations: G390; G391Ornamental modification of plant architecture, manipulation of growthand development, increase in leaf numbers, modulation of branchingpatterns to provide improved yield or biomass Altered branching patternG427; G568; G988; G1543; Ornamental G1794 modification of plantarchitecture, improved lodging resistance Apical dominance Reducedapical dominance: G47; G211; G1255; G1275; Ornamental G1411; G1488;G1794; modification of plant G2509 architecture Altered trichomedensity; development, or structure Reduced or no trichomes: G225; G226;G247; G585; Ornamental G676; G682; G1332; G1452; modification of plantG1816 architecture, increased Ectopic trichomes/altered G247; G362;G370; G676; plant product (e.g., trichome development/cell G2826diterpenes, cotton) fate: productivity, insect Increase in trichomeG362; G634; G838; G2838 and herbivore number, size or density:resistance Stem morphology and G47; G438; G748; G988; Modulation oflignin altered vascular tissue G1488 content; improvement structure ofwood, palatability of fruits and vegetables Root development Increasedroot growth and G1482 Improved yield, stress proliferation: tolerance;anchorage Increased root hairs: G225; G226; G1816 Altered seeddevelopment, G979 ripening and germination Cell differentiation and cellG1540 Increase in carpel or proliferation fruit development; improveregeneration of shoots from callus in transformation ormicro-propagation systems Rapid development G2430 Promote fasterdevelopment and reproduction in plants Senescence Premature senescence:G636; G1463; G1944 Improvement in response to disease, fruit ripeningLethality when G877; G1519 Herbicide target; overexpressed ablation ofspecific tissues or organs such as stamen to prevent pollen escapeNecrosis G12, G24 Disease resistance Plant size Increased plant sizeG1073; G1451 Improved yield, biomass, appearance Larger seedlings G2346;G2838 Increased survival and vigor 
of seedlings, yield Dwarfed or morecompact G24; G343; G353; G354; Dwarfism, lodging plants G362; G370;G1008; G1277; resistance, G1543; G1794; G1958 manipulation ofgibberellin responses Leaf Dark green leaves G674; G912; G1063; G1357;Increased morphology G1452; G1482; G1499; photosynthesis, G1792; G1863;G1888; biomass, appearance, G2143; G2557; G2838 yield Change in leafshape G211; G353; G674; G736; Ornamental G1063; G1146; G1357;applications G1452; G1494; G1543; G1863; G2143; G2144 Altered leaf size:Increased leaf size, number G189; G214; G1451; G2430 Increased yield, ormass: ornamental applications Light green leaves G1494; G2144 Ornamentalapplications Variegation G635 Ornamental applications Glossy leaves G30;G1792; G2583 Ornamental applications, manipulation of wax composition,amount, or distribution Seed Altered seed coloration G156; G2105; G2085Appearance morphology Seed size and shape Increased seed size: G450;G584; G1255; G2085; Yield, appearance G2105; G2114 Decreased seed size:G1040 Appearance Altered seed shape: G1040; G1062 Appearance LeafIncreased leaf wax G975; G1792; G2583 Insect, pathogen biochemistryresistance Leaf prenyl lipids Reduced chlorophyll: G987 Increase intocopherols G652; G987; G2509 Increased lutein content G748 Increase inchlorophyll or G214; G1543 carotenoids: Leaf insoluble sugars Increasein leaf xylose G211 Increased leaf anthocyanins G663; G1482; G1888 Leaffatty acids Reduction in leaf fatty G987 acids: Increase in leaf fattyacids: G214 Seed Seed oil content biochemistry Increased oil content:G162; G291; G427; G509; Improved oil yield G519; G561; G590; G598;Reduced caloric G629; G715; G849; G961; content G1198; G1226; G1471;G1478; G1526; G1640; G1646; G1750; G1765; G1777; G1793; G1838; G1902;G1946; G1948; G1958, G2123; G2138; G2343; G2830 Decreased oil content:G180; G192; G241; G504; G1143; G1229; G1323; G1543; G2509 Altered oilcontent: G567; G892; G974; G1451; G1496; G1646; G1672; G1677 Alteredfatty acid content: G869; G1417; G2192 
Seed protein content
  Increased protein content: G162; G226; G241; G509; G988; G1323; G1419; G1652; G1818; G1820; G1958; G2117; G2509
    Utility: Improved protein yield, nutritional value; reduced caloric content
  Decreased protein content: G427; G1478; G1777; G1903; G1946
  Altered protein content: G162; G567; G597; G849; G892; G1634; G1637; G1677
Altered seed prenyl lipid content or composition: G652; G2509; G2520
    Utility: Improved antioxidant and vitamin E content
Seed glucosinolate
  Altered profile: G484; G2340
Increased seed anthocyanins: G362; G663
Root biochemistry
  Increased root anthocyanins: G663
Light response/shade tolerance
  Altered cotyledon, hypocotyl, petiole development; altered leaf orientation; constitutive photomorphogenesis; photomorphogenesis in low light: G183; G354; G634; G1048; G1322; G1331; G1412; G1488; G1494; G1794; G1995; G2144; G2505; G2555; G2789
    Utility: Improved shade tolerance: potential for increased planting densities and yield enhancement
Pigment
  Increased anthocyanin level: G362; G663; G1482
    Utility: Enhanced health benefits, improved ornamental appearance, increased stress resistance, attraction of pollinating and seed-dispersing animals

Abbreviations: N = nitrogen; P = phosphate; ABA = abscisic acid; C/N = carbon/nitrogen balance

Detailed Description of Genes, Traits and Utilities that Affect Plant Characteristics

The following descriptions of traits and utilities associated with the present transcription factors offer a more comprehensive description than that provided in Table 9.

Abiotic Stress, General Considerations

Plant transcription factors can modulate gene expression, and, in turn, be modulated by the environmental experience of a plant. Significant alterations in a plant's environment invariably result in a change in the plant's transcription factor gene expression pattern. Altered transcription factor expression patterns generally result in phenotypic changes in the plant. Transcription factor gene product(s) in transgenic plants then differ(s) in amounts or proportions from that found in wild-type or non-transformed plants, and those transcription factors likely represent polypeptides that are used to alter the response to the environmental change. By way of example, it is well accepted in the art that analytical methods based on altered expression patterns may be used to screen for phenotypic changes in a plant far more effectively than can be achieved using traditional methods.

Abiotic stress: adult stage chilling. Enhanced chilling tolerance may extend the effective growth range of chilling sensitive crop species by allowing earlier planting or later harvest. Improved chilling tolerance may be conferred by increased expression of glycerol-3-phosphate acyltransferase in chloroplasts (see, for example, Wolter et al. (1992) EMBO J. 4685-4692, and Murata et al. (1992) Nature 356: 710-713).

Chilling tolerance could also serve as a model for understanding how plants adapt to water deficit. Both chilling and water stress share similar signal transduction pathways and tolerance/adaptation mechanisms. For example, acclimation to chilling temperatures can be induced by water stress or treatment with abscisic acid. Genes induced by low temperature include dehydrins (or LEA proteins). Dehydrins are also induced by salinity, abscisic acid, water stress, and during the late stages of embryogenesis.

Another large impact of chilling occurs during post-harvest storage. For example, some fruits and vegetables do not store well at low temperatures (for example, bananas, avocados, melons, and tomatoes). The normal ripening process of the tomato is impaired if it is exposed to cool temperatures. Transcription factor genes conferring resistance to chilling temperatures, including G256, G664, and G1322, may thus enhance tolerance during post-harvest storage.

Abiotic stress: cold germination. Several of the presently disclosed transcription factor genes confer better germination and growth in cold conditions. For example, the improved germination in cold conditions seen with G256 and G664 indicates a role in regulation of cold responses by these genes and their equivalogs. These genes might be engineered to manipulate the response to low temperature stress. Genes that would allow germination and seedling vigor in the cold would have highly significant utility in allowing seeds to be planted earlier in the season with a high rate of survival. Transcription factor genes that confer better survival in cooler climates allow a grower to move up planting time in the spring and extend the growing season further into autumn for higher crop yields. Germination of seeds and survival at temperatures significantly below that of the mean temperature required for germination of seeds and survival of non-transformed plants would increase the potential range of a crop plant into regions in which it would otherwise fail to thrive.

Abiotic stress: freezing tolerance and osmotic stress. Presently disclosed transcription factor genes, including G47, G175, G188, G303, G325, G353, G489, G922, G926, G1069, G1089, G1452, G1820, G1852, G1930, G2053, G2140, G2153, G2379, G2701, G2719, G2789, G2839 and their equivalogs, that increase germination rate and/or growth under adverse osmotic conditions, could impact survival and yield of seeds and plants. Osmotic stresses may be regulated by specific molecular control mechanisms that include genes controlling water and ion movements, functional and structural stress-induced proteins, signal perception and transduction, and free radical scavenging, and many others (Wang et al. (2001) Acta Hort. (ISHS) 560: 285-292). Instigators of osmotic stress include freezing, drought and high salinity, each of which is discussed in more detail below.

In many ways, freezing, high salt and drought have similar effects on plants, not the least of which is the induction of common polypeptides that respond to these different stresses. For example, freezing is similar to water deficit in that freezing reduces the amount of water available to a plant. Exposure to freezing temperatures may lead to cellular dehydration as water leaves cells and forms ice crystals in intercellular spaces (Buchanan, supra). As with high salt concentration and freezing, the problems for plants caused by low water availability include mechanical stresses caused by the withdrawal of cellular water. Thus, the incorporation of transcription factors that modify a plant's response to osmotic stress or improve tolerance to freezing (e.g., G720, G912, G913 or their equivalogs) into, for example, a crop or ornamental plant, may be useful in reducing damage or loss. Specific effects caused by freezing, high salt and drought are addressed below.

Abiotic stress: drought and low humidity tolerance. Exposure to dehydration invokes similar survival strategies in plants as does freezing stress (see, for example, Yelenosky (1989) Plant Physiol 89: 444-451) and drought stress induces freezing tolerance (see, for example, Siminovitch et al. (1982) Plant Physiol 69: 250-255; and Guy et al. (1992) Planta 188: 265-270). In addition to the induction of cold-acclimation proteins, strategies that allow plants to survive in low water conditions may include, for example, reduced surface area, or surface oil or wax production. A number of presently disclosed transcription factor genes, e.g., G912, G913, G1820, G1836 and G2505, increase a plant's tolerance to low water conditions and, along with their functional equivalogs, would provide the benefits of improved survival, increased yield and an extended geographic and temporal planting range.

Abiotic stress: heat stress tolerance. The germination of many crops is also sensitive to high temperatures. Presently disclosed transcription factor genes that provide increased heat tolerance, including G464, G682, G864, G1305, G1841, G2130, G2430 and their equivalogs, would be generally useful in producing plants that germinate and grow in hot conditions, may find particular use for crops that are planted late in the season, or extend the range of a plant by allowing growth in relatively hot climates.

Abiotic stress: salt. The genes in Table 9 that provide tolerance to salt may be used to engineer salt tolerant crops and trees that can flourish in soils with high saline content or under drought conditions. In particular, increased salt tolerance during the germination stage of a plant enhances survival and yield. Presently disclosed transcription factor genes, including G22, G196, G226, G312, G482, G801, G867, G922, G1836, G2110, and their equivalogs that provide increased salt tolerance during germination, the seedling stage, and throughout a plant's life cycle, would find particular value for imparting survival and yield in areas where a particular crop would not normally prosper.
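Because several genes recur across the osmotic, drought, heat and salt discussions above, a simple cross-tabulation can show which genes are named under more than one stress category. The following Python sketch is illustrative only: the gene lists are transcribed from the four preceding "Abiotic stress" paragraphs (they are not a complete copy of Table 9, whose drought row is longer), and the dictionary and function names are hypothetical.

```python
from collections import defaultdict

# Gene lists transcribed from the preceding paragraphs of this section.
# These are the genes named in the text, not the full Table 9 rows.
STRESS_GENES = {
    "osmotic": ["G47", "G175", "G188", "G303", "G325", "G353", "G489",
                "G922", "G926", "G1069", "G1089", "G1452", "G1820",
                "G1852", "G1930", "G2053", "G2140", "G2153", "G2379",
                "G2701", "G2719", "G2789", "G2839"],
    "drought": ["G912", "G913", "G1820", "G1836", "G2505"],
    "heat": ["G464", "G682", "G864", "G1305", "G1841", "G2130", "G2430"],
    "salt": ["G22", "G196", "G226", "G312", "G482", "G801", "G867",
             "G922", "G1836", "G2110"],
}


def genes_in_multiple_categories(stress_genes):
    """Map each gene named under two or more categories to those categories."""
    categories_by_gene = defaultdict(list)
    for category, genes in stress_genes.items():
        for gene in genes:
            categories_by_gene[gene].append(category)
    return {gene: cats for gene, cats in categories_by_gene.items()
            if len(cats) > 1}


shared = genes_in_multiple_categories(STRESS_GENES)

# Print in numeric order of the GID (the integer after the leading "G").
for gene in sorted(shared, key=lambda g: int(g[1:])):
    print(gene, ", ".join(shared[gene]))
```

For the lists as transcribed, the genes reported under more than one category are G922 (osmotic, salt), G1820 (osmotic, drought) and G1836 (drought, salt). Running the same function over the full Table 9 rows would be expected to reveal further overlaps.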

Nutrient uptake and utilization: nitrogen and phosphorus. Presently disclosed transcription factor genes introduced into plants provide a means to improve uptake of essential nutrients, including nitrogenous compounds, phosphates, potassium, and trace minerals. The enhanced performance of, for example, G225, G226, G839, G1792, and other overexpressing lines under low nitrogen, and G545, G561, G911, G1946 under low phosphorus conditions indicates that these genes and their equivalogs can be used to engineer crops that could thrive under conditions of reduced nutrient availability. Phosphorus, in particular, tends to be a limiting nutrient in soils and is generally added as a component in fertilizers. Young plants have a rapid intake of phosphate and sufficient phosphate is important for yield of root crops such as carrot, potato and parsnip.

The effect of these modifications is to increase the seedling germination and range of ornamental and crop plants. The utilities of presently disclosed transcription factor genes conferring tolerance to conditions of low nutrients also include cost savings to the grower by reducing the amounts of fertilizer needed, environmental benefits of reduced fertilizer runoff into watersheds, and improved yield and stress tolerance. In addition, by providing improved nitrogen uptake capability, these genes can be used to alter seed protein amounts and/or composition in such a way that could impact yield as well as the nutritional value and production of various food products.

The observation that a number of the transcription factor-overexpressing lines make less anthocyanin on high sucrose plus glutamine indicates that these genes can be used to modify carbon and nitrogen status, and hence assimilate partitioning (assimilate partitioning refers to the manner in which an essential element, such as nitrogen, is distributed among different pools inside a plant, generally in a reduced form, for the purpose of transport to various tissues).

Increased tolerance of plants to oxidative stress. In plants, as in all living things, abiotic and biotic stresses induce the formation of oxygen radicals, including superoxide and peroxide radicals. This has the effect of accelerating senescence, particularly in leaves, with the resulting loss of yield and adverse effect on appearance. Generally, plants that have the highest level of defense mechanisms, such as, for example, polyunsaturated moieties of membrane lipids, are most likely to thrive under conditions that introduce oxidative stress (e.g., high light, ozone, water deficit, particularly in combination). Introduction of the presently disclosed transcription factor genes, including G477 and its equivalogs, that increase the level of oxidative stress defense mechanisms would provide beneficial effects on the yield and appearance of plants. One specific oxidizing agent, ozone, has been shown to cause significant foliar injury, which impacts yield and appearance of crop and ornamental plants. In addition to reduced foliar injury that would be found in ozone-resistant plants created by transforming plants with some of the presently disclosed transcription factor genes, the latter have also been shown to have increased chlorophyll fluorescence (Yu-Sen Chang et al. (2001) Bot. Bull. Acad. Sin. 42: 265-272).

Decreased herbicide sensitivity. Presently disclosed transcription factor genes, including G343, G2133, G2517 and their equivalogs, that confer resistance or tolerance to herbicides (e.g., glyphosate) will find use in providing means to increase herbicide applications without detriment to desirable plants. This would allow for the increased use of a particular herbicide in a local environment, with the effect of increased detriment to undesirable species and less harm to transgenic, desirable cultivars.

Knockouts of a number of the presently disclosed transcription factor genes have been shown to be lethal to developing embryos. Thus, these genes are potentially useful as herbicide targets.

Hormone sensitivity. ABA plays regulatory roles in a host of physiological processes in all higher as well as in lower plants (Davies et al. (1991) Abscisic Acid: Physiology and Biochemistry. Bios Scientific Publishers, Oxford, UK; Zeevaart et al. (1988) Ann Rev Plant Physiol. Plant Mol. Biol. 49: 439-473; Shimizu-Sato et al. (2001) Plant Physiol 127: 1405-1413). ABA mediates stress tolerance responses in higher plants, is a key signal compound that regulates stomatal aperture and, in concert with other plant signaling compounds, is implicated in mediating responses to pathogens and wounding or oxidative damage (for example, see Larkindale et al. (2002) Plant Physiol. 128: 682-695). In seeds, ABA promotes seed development, embryo maturation, synthesis of storage products (proteins and lipids), and desiccation tolerance, and is involved in maintenance of dormancy (inhibition of germination) and apoptosis (Zeevaart et al. (1988) Ann Rev Plant Physiol. Plant Mol. Biol. 49: 439-473; Davies (1991), supra; Thomas (1993) Plant Cell 5: 1401-1410; and Bethke et al. (1999) Plant Cell 11: 1033-1046). ABA also affects plant architecture, including root growth and morphology and root-to-shoot ratios. ABA action and metabolism are modulated not only by environmental signals but also by endogenous signals generated by metabolic feedback, transport, hormonal cross-talk and developmental stage. Manipulation of ABA levels, and hence by extension the sensitivity to ABA, has been described as a very promising means to improve productivity, performance and architecture in plants (Zeevaart (1999) in: Biochemistry and Molecular Biology of Plant Hormones, Hooykaas et al. eds, Elsevier Science pp 189-207; and Cutler et al. (1999) Trends Plant Sci. 4: 472-478).

A number of the presently disclosed transcription factor genes affect plant abscisic acid (ABA) sensitivity, including G546, G926, G1069, G1357, G1452, G1820, G2140, and G2789. Thus, by affecting ABA sensitivity, these introduced transcription factor genes and their equivalogs would affect cold, drought, oxidative and other stress sensitivities, plant architecture, and yield.

Several others of the presently disclosed transcription factor genes have been used to manipulate ethylene signal transduction and response pathways. These genes can thus be used to manipulate the processes influenced by ethylene, such as seed germination or fruit ripening, and to improve seed or fruit quality.

Diseases, pathogens and pests. A number of the presently disclosed transcription factor genes have been shown to or are likely to affect a plant's response to various plant diseases, pathogens and pests. The offending organisms include the fungal pathogens Fusarium oxysporum, Botrytis cinerea, Sclerotinia sclerotiorum, and Erysiphe orontii. Bacterial pathogens to which resistance may be conferred include Pseudomonas syringae. Other problem organisms may potentially include nematodes, mollicutes, parasites, or herbivorous arthropods. In each case, one or more transformed transcription factor genes may provide some benefit to the plant to help prevent or overcome infestation, or be used to manipulate any of the various plant responses to disease. The mechanisms by which the transcription factors work could include increasing surface waxes or oils, surface thickness, or the activation of signal transduction pathways that regulate plant defense in response to attacks by herbivorous pests (including, for example, protease inhibitors). Another means to combat fungal and other pathogens is by accelerating local cell death or senescence, mechanisms used to impair the spread of pathogenic microorganisms throughout a plant. For instance, the best known example of accelerated cell death is the resistance gene-mediated hypersensitive response, which causes localized cell death at an infection site and initiates a systemic defense response. Because many defenses, signaling molecules, and signal transduction pathways are common to defense against different pathogens and pests, such as fungal, bacterial, oomycete, nematode, and insect, transcription factors that are implicated in defense responses against the fungal pathogens tested may also function in defense against other pathogens and pests. These transcription factors include, for example, G28, G1792, G1880, G1919, G1950 (improved resistance or tolerance to Botrytis), G1047, G1792 (improved resistance or tolerance to Fusarium), G19, G28, G409, G1266, G1363, G1792 (improved resistance or tolerance to Erysiphe), G545 (improved resistance or tolerance to Pseudomonas), G28, G1927 (improved resistance or tolerance to Sclerotinia), and their equivalogs.
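As an illustration only, the gene-to-pathogen associations listed above can be tabulated and inverted to show which genes are named for more than one pathogen. The following is a minimal sketch, not part of the disclosed methods; the names RESISTANCE_GENES, pathogens_by_gene and multi_pathogen_genes are hypothetical, and the table holds only the genes recited in the preceding list.

```python
from collections import defaultdict

# Gene lists copied from the preceding paragraph; keys are the pathogen
# genera named there. The table and function names are hypothetical.
RESISTANCE_GENES = {
    "Botrytis": ["G28", "G1792", "G1880", "G1919", "G1950"],
    "Fusarium": ["G1047", "G1792"],
    "Erysiphe": ["G19", "G28", "G409", "G1266", "G1363", "G1792"],
    "Pseudomonas": ["G545"],
    "Sclerotinia": ["G28", "G1927"],
}


def pathogens_by_gene(table):
    """Invert a pathogen -> genes table into a gene -> pathogens table."""
    inverted = defaultdict(list)
    for pathogen, genes in table.items():
        for gene in genes:
            inverted[gene].append(pathogen)
    return dict(inverted)


def multi_pathogen_genes(table, minimum=2):
    """Return the genes listed for at least `minimum` pathogens."""
    return {
        gene: pathogens
        for gene, pathogens in pathogens_by_gene(table).items()
        if len(pathogens) >= minimum
    }


if __name__ == "__main__":
    for gene, pathogens in sorted(multi_pathogen_genes(RESISTANCE_GENES).items()):
        print(gene, "->", ", ".join(pathogens))
```

In this tabulation only G28 and G1792 are listed for more than one pathogen, which is consistent with the statement that factors implicated in defense against one pathogen may also function against others.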

Growth regulator: sugar sensing. In addition to their important role as an energy source and structural component of the plant cell, sugars are central regulatory molecules that control several aspects of plant physiology, metabolism and development (Hsieh et al. (1998) Proc. Natl. Acad. Sci. 95: 13965-13970). It is thought that this control is achieved by regulating gene expression and, in higher plants, sugars have been shown to repress or activate plant genes involved in many essential processes such as photosynthesis, glyoxylate metabolism, respiration, starch and sucrose synthesis and degradation, pathogen response, wounding response, cell cycle regulation, pigmentation, flowering and senescence. The mechanisms by which sugars control gene expression are not understood.

Because sugars are important signaling molecules, the ability to control either the concentration of a signaling sugar or how the plant perceives or responds to a signaling sugar could be used to control plant development, physiology or metabolism. For example, the flux of sucrose (a disaccharide sugar used for systemically transporting carbon and energy in most plants) has been shown to affect gene expression and alter storage compound accumulation in seeds. Manipulation of the sucrose signaling pathway in seeds may therefore cause seeds to have more protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in tubers, sucrose is converted to starch, which is used as an energy store. It is thought that sugar signaling pathways may partially determine the levels of starch synthesized in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a higher starch content.

Thus, the presently disclosed transcription factor genes that manipulate the sugar signal transduction pathway, including G241, G254, G567, G680, G912, G1804, G481, G867, G1225, along with their equivalogs, may lead to altered gene expression to produce plants with desirable traits. In particular, manipulation of sugar signal transduction pathways could be used to alter source-sink relationships in seeds, tubers, roots and other storage organs, leading to an increase in yield.

Growth regulator: C/N sensing. Nitrogen and carbon metabolism are tightly linked in almost every biochemical pathway in the plant. Carbon metabolites regulate genes involved in N acquisition and metabolism, and are known to affect germination and the expression of photosynthetic genes (Coruzzi et al. (2001) Plant Physiol. 125: 61-64) and hence growth. Early studies on nitrate reductase (NR) in 1976 showed that NR activity could be affected by Glc/Suc (Crawford (1995) Plant Cell 7: 859-886; Daniel-Vedele et al. (1996) CR Acad Sci Paris 319: 961-968). Those observations were supported by later experiments that showed sugars induce NR mRNA in dark-adapted, green seedlings (Cheng C L, et al. (1992) Proc Natl Acad Sci USA 89: 1861-1864). C and N may have antagonistic relationships as signaling molecules; light induction of NR activity and mRNA levels can be mimicked by C metabolites, and N-metabolites cause repression of NR induction in tobacco (Vincentz et al. (1992) Plant J 3: 315-324). Gene regulation by C/N status has been demonstrated for a number of N-metabolic genes (Stitt (1999) Curr. Opin. Plant. Biol. 2: 178-186; Coruzzi et al. (2001) supra). Thus, transcription factor genes that affect C/N sensing, such as G1816, can be used to alter or improve germination and growth under nitrogen-limiting conditions.

Flowering time: early and late flowering. Presently disclosed transcription factor genes that accelerate flowering, which include G157, G180, G183, G485, G490, G590, G789, G1225, G1494, G1820, G1841, G1842, G1843, G1946, G2010, G2144, G2295, G2347, G2509, and their functional equivalogs, could have valuable applications in breeding programs, since they allow much faster generation times. In a number of species, for example, broccoli and cauliflower, where the reproductive parts of the plants constitute the crop and the vegetative tissues are discarded, it would be advantageous to accelerate time to flowering. Accelerating flowering could shorten crop and tree breeding programs. Additionally, in some instances, a faster generation time would allow additional harvests of a crop to be made within a given growing season. A number of Arabidopsis genes have already been shown to accelerate flowering when constitutively expressed. These include LEAFY, APETALA1 and CONSTANS (Mandel et al. (1995) Nature 377: 522-524; Weigel and Nilsson (1995) Nature 377: 495-500; Simon et al. (1996) Nature 384: 59-62).

By regulating the expression of potential flowering genes using inducible promoters, flowering could be triggered by application of an inducer chemical. This would allow flowering to be synchronized across a crop and facilitate more efficient harvesting. Such inducible systems could also be used to tune the flowering of crop varieties to different latitudes. At present, species such as soybean and cotton are available as a series of maturity groups that are suitable for different latitudes on the basis of their flowering time (which is governed by day-length). A system in which flowering could be chemically controlled would allow a single high-yielding northern maturity group to be grown at any latitude. In southern regions such plants could be grown for longer periods before flowering was induced, thereby increasing yields. In more northern areas, the induction would be used to ensure that the crop flowers prior to the first winter frosts.

In a sizeable number of species, for example, root crops, where the vegetative parts of the plants constitute the crop and the reproductive tissues are discarded, it is advantageous to identify and incorporate transcription factor genes that delay or prevent flowering in order to prevent resources being diverted into reproductive development. For example, G8, G47, G157, G192, G214, G231, G361, G362, G562, G736, G748, G859, G910, G913, G971, G1051, G1052, G1357, G1452, G1478, G1804, G1895, G1945, G2007, G2133, G2155, G2838 and equivalogs delay flowering time in transgenic plants. Extending vegetative development with presently disclosed transcription factor genes could thus bring about large increases in yields. Prevention of flowering can help maximize vegetative yields and prevent escape of genetically modified organism (GMO) pollen.

Presently disclosed transcription factors that extend flowering time have utility in engineering plants with longer-lasting flowers for the horticulture industry, and for extending the time in which the plant is fertile.

A number of the presently disclosed transcription factors may extend flowering time and delay flower abscission, which would have utility in engineering plants with longer-lasting flowers for the horticulture industry. This would provide a significant benefit to the ornamental industry, for both cut flowers and woody plant varieties (of, for example, maize), as well as have the potential to lengthen the fertile period of a plant, which could positively impact yield and breeding programs.

General development and morphology: flower structure and inflorescence: architecture, altered flower organs, reduced fertility, multiple alterations, aerial rosettes, branching, internode distance, terminal flowers and phase change. Presently disclosed transgenic transcription factors such as G353, G354, G638, G779, G988, G1063, G1075, G1140, G1449, G1499, G2143, G2557, G2838, G2839 and their equivalogs may be used to create plants with larger flowers or arrangements of flowers that are distinct from wild-type or non-transformed cultivars. This would likely have the most value for the ornamental horticulture industry, where larger flowers or interesting floral configurations are generally preferred and command the highest prices.

Flower structure may have advantageous or deleterious effects on fertility, and could be used, for example, to decrease fertility by the absence, reduction or screening of reproductive components. In fact, plants that overexpress a sizable number of the presently disclosed transcription factor genes, e.g., G470, G779, G988, G1075, G1140, G1499, G1947, G2143, G2557 and their functional equivalogs, possess reduced fertility; flowers are infertile and fail to yield seed. These could be desirable traits, as low fertility could be exploited to prevent or minimize the escape of the pollen of genetically modified organisms (GMOs) into the environment.

The alterations in shoot architecture seen in the lines transformed with G47, G1063, G1645, G2143, and their functional equivalogs indicate that these genes and their equivalogs can be used to manipulate inflorescence branching patterns. This could influence yield and offer the potential for more effective harvesting techniques. For example, a “self pruning” mutation of tomato results in a determinate growth pattern and facilitates mechanical harvesting (Pnueli et al. (2001) Plant Cell 13(12): 2687-702).

One interesting application for manipulation of flower structure, for example, by introduced transcription factors, could be in the increased production of edible flowers or flower parts, including saffron, which is derived from the stigmas of Crocus sativus.

Genes that alter silique conformation in brassicates may be used to modify fruit ripening processes in brassicates and other plants, which may positively affect seed or fruit quality.

A number of the presently disclosed transcription factors may affect the timing of phase changes in plants. Since the timing of phase changes generally affects a plant's eventual size, these genes may prove beneficial by providing means for improving yield and biomass.

General development and morphology: shoot meristem and branching patterns. Several of the presently disclosed transcription factor genes, including G390, G391, and G1794, when introduced into plants, have been shown to cause stem bifurcations in developing shoots in which the shoot meristems split to form two or three separate shoots. These transcription factors and their functional equivalogs may thus be used to manipulate branching. This would provide a unique appearance, which may be desirable in ornamental applications, and may be used to modify lateral branching for use in the forestry industry. A reduction in the formation of lateral branches could reduce knot formation. Conversely, increasing the number of lateral branches could provide utility when a plant is used as a view- or windscreen.

General development and morphology: apical dominance. The modified expression of presently disclosed transcription factors (e.g., G47, G211, G1255, G1275, G1411, G1488, G1794, G2509 and their equivalogs) that reduce apical dominance could be used in ornamental horticulture to modify plant architecture, for example, to produce a shorter, more bushy stature than wild type. The latter form would have ornamental utility as well as provide increased resistance to lodging.

General development and morphology: trichome density, development or structure. Several of the presently disclosed transcription factor genes have been used to modify trichome number, density, trichome cell fate, or the amount of trichome products produced by plants, or to produce ectopic trichome formation. These include G225, G226, G247, G362, G370, G585, G634, G676, G682, G1332, G1452, G1995, G2826, and G2838. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity. Thus, by increasing trichome density, size or type, these trichome-affecting genes and their functional equivalogs would have profound utilities in molecular farming practices by making use of trichomes as a manufacturing system for complex secondary metabolites.

Trichome glands on the surface of many higher plants produce and secrete exudates that give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or anti-microbial, or may act as allergens or irritants to protect against herbivores. By modifying trichome location, density or activity with presently disclosed transcription factors that modify these plant characteristics, plants that are better protected and higher yielding may be the result.

A potential application for these trichome-affecting genes and their equivalogs also exists in cotton: cotton fibers are modified unicellular trichomes that develop from the outer ovule epidermis. In fact, only about 30% of these epidermal cells develop into trichomes, but all have the potential to develop a trichome fate. Trichome-affecting genes can trigger an increased number of these cells to develop as trichomes and thereby increase the yield of cotton fibers. Since the mallow family is closely related to the Brassica family, genes involved in trichome formation will likely have homologs in cotton or function in cotton.

If the effects on trichome patterning reflect a general change in heterochronic processes, trichome-affecting transcription factors or their equivalogs can be used to modify the way meristems and/or cells develop during different phases of the plant life cycle. In particular, altering the timing of phase changes could afford positive effects on yield and biomass production.

General development and morphology: stem morphology and altered vascular tissue structure. Plants transformed with transcription factor genes that modify stem morphology or lignin content may be used to affect overall plant architecture and the distribution of lignified fiber cells within the stem.

Modulating lignin content might allow the quality of wood used for furniture or construction to be improved. Lignin is energy rich; increasing lignin composition could therefore be valuable in raising the energy content of wood used for fuel. Conversely, the pulp and paper industries seek wood with a reduced lignin content. Currently, lignin must be removed in a costly process that involves the use of many polluting chemicals. Consequently, lignin is a serious barrier to efficient pulp and paper production (Tzfira et al. (1998) TIBTECH 16: 439-446; Robinson (1999) Nature Biotechnology 17: 27-30). In addition to forest biotechnology applications, changing lignin content by selectively expressing or repressing transcription factors in fruits and vegetables might increase their palatability.

Transcription factors that modify stem structure, including G47, G438, G748, G988, G1488 and their equivalogs, may also be used to achieve reduction of higher-order shoot development, resulting in significant plant architecture modification. Overexpression of the genes that encode these transcription factors in woody plants might result in trees that lack side branches and have fewer knots in the wood. Altering branching patterns could also have applications amongst ornamental and agricultural crops. For example, applications might exist in any species where secondary shoots currently have to be removed manually, or where changes in branching pattern could increase yield or facilitate more efficient harvesting.

General development and morphology: altered root development. By modifying the structure or development of roots by transforming into a plant one or more of the presently disclosed transcription factor genes, including G225, G226, G1482, and their equivalogs, plants may be produced that have the capacity to thrive in otherwise unproductive soils. For example, grape roots extending further into rocky soils would provide greater anchorage, greater coverage with increased branching, or would remain viable in waterlogged soils, thus increasing the effective planting range of the crop and/or increasing yield and survival. It may be advantageous to manipulate a plant to produce short roots, as when a soil in which the plant will be growing is occasionally flooded, or when pathogenic fungi or disease-causing nematodes are prevalent.

General development and morphology: seed development, ripening and germination rate. A number of the presently disclosed transcription factor genes (e.g., G979) have been shown to modify seed development and germination rate, including when the seeds are in conditions normally unfavorable for germination (e.g., cold, heat or salt stress, or in the presence of ABA), and may, along with functional equivalogs, thus be used to modify and improve germination rates under adverse conditions.

General development and morphology: cell differentiation and cell proliferation. Several of the disclosed transcription factors regulate cell proliferation and/or differentiation, including G1540 and its functional equivalogs. Control of these processes could have valuable applications in plant transformation, cell culture or micro-propagation systems, as well as in control of the proliferation of particular useful tissues or cell types. Transcription factors that induce the proliferation of undifferentiated cells can be operably linked with an inducible promoter to promote the formation of callus that can be used for transformation or production of cell suspension cultures. Transcription factors that prevent cells from differentiating, such as G1540 or its equivalogs, could be used to confer stem cell identity to cultured cells. Transcription factors that promote differentiation of shoots could be used in transformation or micro-propagation systems, where regeneration of shoots from callus is currently problematic. In addition, transcription factors that regulate the differentiation of specific tissues could be used to increase the proportion of these tissues in a plant. Genes that promote the differentiation of carpel tissue could be introduced into commercial species to induce formation of increased numbers of carpels or fruits. A particular application might exist in saffron, one of the world's most expensive spices. Saffron filaments, or threads, are actually the dried stigmas of the saffron flower, Crocus sativus Linnaeus. Each flower contains only three stigmas, and more than 75,000 of these flowers are needed to produce just one pound of saffron filaments. An increase in carpel number would increase the quantity of stigmatic tissue and improve yield.
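The saffron figures above imply a simple proportion, sketched here for illustration. The sketch takes the stated 75,000 flowers per pound as a lower bound and assumes, hypothetically, that the dry mass of each stigma is unchanged when the number of stigmas per flower increases; the constant and function names are illustrative and do not appear in the disclosure.

```python
import math

# Figures stated in the text: three stigmas per flower and "more than 75,000"
# flowers per pound, so 75,000 is used as a lower bound.
FLOWERS_PER_POUND = 75_000
STIGMAS_PER_FLOWER = 3


def stigmas_per_pound():
    """Stigmas needed for one pound of saffron, from the stated figures."""
    return FLOWERS_PER_POUND * STIGMAS_PER_FLOWER


def flowers_needed(stigmas_per_flower):
    """Flowers needed per pound at a given stigma count per flower.

    Hypothetical assumption: each stigma contributes the same dry mass
    as in an unmodified flower.
    """
    if stigmas_per_flower <= 0:
        raise ValueError("stigmas_per_flower must be positive")
    return math.ceil(stigmas_per_pound() / stigmas_per_flower)


if __name__ == "__main__":
    for count in (3, 4, 6):
        print(count, "stigmas per flower:", flowers_needed(count), "flowers per pound")
```

Under these assumptions, doubling the stigma count per flower would halve the number of flowers required per pound.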

General development and morphology: cell expansion. Plant growth results from a combination of cell division and cell expansion. Transcription factors may be useful in regulation of cell expansion. Altered regulation of cell expansion could affect stem length, an important agronomic characteristic. For instance, short cultivars of wheat contributed to the Green Revolution, because plants that put fewer resources into stem elongation allocate more resources into developing seed and produce higher yield. These plants are also less vulnerable to wind and rain damage. These cultivars were found to be altered in their sensitivity to gibberellins, hormones that regulate stem elongation through control of both cell expansion and cell division. Altered cell expansion in leaves could also produce novel and ornamental plant forms.

General development and morphology: phase change and floral reversion. Transcription factors that regulate phase change can modulate the developmental programs of plants and regulate developmental plasticity of the shoot meristem. In particular, these genes might be used to manipulate seasonality and influence whether plants display an annual or perennial habit.

General development and morphology: rapid development. A number of the presently disclosed transcription factor genes, including G2430, have been shown to have significant effects on plant growth rate and development. These observations have included, for example, more rapid or delayed growth and development of reproductive organs. Thus, by causing more rapid development, G2430 and its functional equivalogs would prove useful for regions with short growing seasons; other transcription factors that delay development may be useful for regions with longer growing seasons. Accelerating plant growth would also improve early yield or increase biomass at an earlier stage, when such is desirable (for example, in producing forestry products or vegetable sprouts for consumption). Transcription factors that promote faster development such as G2430 and its functional equivalogs may also be used to modify the reproductive cycle of plants.

General development and morphology: slow growth rate. A number of the presently disclosed transcription factor genes, including G652 and G1335, have been shown to have significant effects on retarding plant growth rate and development. These observations have included, for example, delayed growth and development of reproductive organs. Slow growing plants may be highly desirable to ornamental horticulturists, both for providing house plants that display little change in their appearance over time, and for outdoor plants for which wild-type or rapid growth is undesirable (e.g., ornamental palm trees). Slow growth may also provide for a prolonged fruiting period, thus extending the harvesting season, particularly in regions with long growing seasons. Slow growth could also provide a prolonged period in which pollen is available for improved self- or cross-fertilization, or cross-fertilization of cultivars that normally flower over non-overlapping time periods. The latter aspect may be particularly useful to plants comprising two or more distinct grafted cultivars (e.g., fruit trees) with normally non-overlapping flowering periods.

General development and morphology: senescence. Presently disclosed transcription factor genes may be used to alter senescence responses in plants. Although leaf senescence is thought to be an evolutionary adaptation to recycle nutrients, the ability to control senescence in an agricultural setting has significant value. For example, a delay in leaf senescence in some maize hybrids is associated with a significant increase in yields, and a delay of a few days in the senescence of soybean plants can have a large impact on yield. In an experimental setting, tobacco plants engineered to inhibit leaf senescence had a longer photosynthetic lifespan, and produced a 50% increase in dry weight and seed yield (Gan and Amasino (1995) Science 270: 1986-1988). Delayed flower senescence caused by overexpression of transcription factors may generate plants that retain their blossoms longer, which may be of interest to the ornamental horticulture industry, and delayed foliar and fruit senescence could improve post-harvest shelf-life of produce.

Premature senescence caused by, for example, G636, G1463, G1944 and their equivalogs may be used to improve a plant's response to disease and hasten fruit ripening.

Growth rate and development: lethality and necrosis. Overexpression of transcription factors, for example, G12, G24, G877, G1519 and their equivalogs, that have a role in regulating cell death may be used to induce lethality in specific tissues or necrosis in response to pathogen attack. For example, if a transcription factor gene inducing lethality or necrosis were specifically active in gametes or reproductive organs, its expression in these tissues would lead to ablation and subsequent male or female sterility. Alternatively, under pathogen-regulated expression, a necrosis-inducing transcription factor can restrict the spread of a pathogen infection through a plant.

Plant size: large plants. Plants overexpressing G1073 and G1451, for example, have been shown to be larger than controls. For some ornamental plants, the ability to provide larger varieties with these genes or their equivalogs may be highly desirable. For many plants, including fruit-bearing trees, trees that are used for lumber production, or trees and shrubs that serve as view or wind screens, increased stature provides improved benefits in the forms of greater yield or improved screening. Crop species may also produce higher yields on larger cultivars, particularly those in which the vegetative portion of the plant is edible.

Plant size: large seedlings. Presently disclosed transcription factor genes that produce large seedlings can be used to produce crops that become established faster. Large seedlings are generally hardier, less vulnerable to stress, and better able to out-compete weed species. Seedlings transformed with presently disclosed transcription factors, including G2346 and G2838, for example, have been shown to possess larger cotyledons and were more developmentally advanced than control plants. Rapid seedling development made possible by manipulating expression of these genes or their equivalogs is likely to reduce loss due to diseases particularly prevalent at the seedling stage (e.g., damping off) and is thus important for survivability of plants germinating in the field or in controlled environments.

Plant size: dwarfed plants. Presently disclosed transcription factor genes, including G24, G343, G353, G354, G362, G370, G1008, G1277, G1543, G1794, G1958 and their equivalogs, for example, that can be used to decrease plant stature are likely to produce plants that are more resistant to damage by wind and rain, have improved lodging resistance, or are more resistant to heat or low humidity or water deficit. Dwarf plants are also of significant interest to the ornamental horticulture industry, and particularly for home garden applications for which space availability may be limited.

Plant size: fruit size and number. Introduction of presently disclosed transcription factor genes that affect fruit size will have desirable impacts on fruit size and number, which may comprise increases in yield for fruit crops, or reduced fruit yield, such as when vegetative growth is preferred (e.g., with bushy ornamentals, or where fruit is undesirable, as with ornamental olive trees).

Leaf morphology: dark leaves. Color-affecting components in leaves include chlorophylls (generally green), anthocyanins (generally red to blue) and carotenoids (generally yellow to red). Transcription factor genes that increase these pigments in leaves, including G674, G912, G1063, G1357, G1452, G1482, G1499, G1792, G1863, G1888, G2143, G2557, G2838 and their equivalogs, may positively affect a plant's value to the ornamental horticulture industry. Variegated varieties, in particular, would show improved contrast. Other uses that result from overexpression of transcription factor genes include improvements in the nutritional value of foodstuffs. For example, lutein is an important nutraceutical; lutein-rich diets have been shown to help prevent age-related macular degeneration (ARMD), the leading cause of blindness in elderly people. Consumption of dark green leafy vegetables has been shown in clinical studies to reduce the risk of ARMD.

Enhanced chlorophyll and carotenoid levels could also improve yield in crop plants. Lutein, like other xanthophylls such as zeaxanthin and violaxanthin, is an essential component in the protection of the plant against the damaging effects of excessive light. Specifically, lutein contributes, directly or indirectly, to the rapid rise of non-photochemical quenching in plants exposed to high light. Crop plants engineered to contain higher levels of lutein could therefore have improved photo-protection, leading to less oxidative damage and better growth under high light (e.g., during long summer days, or at higher altitudes or lower latitudes than those at which a non-transformed plant would survive). Additionally, elevated chlorophyll levels increase photosynthetic capacity.

Leaf morphology: changes in leaf shape. Presently disclosed transcription factors produce marked and diverse effects on leaf development and shape. The transcription factors include G211, G353, G674, G736, G1063, G1146, G1357, G1452, G1494, G1543, G1863, G2143, G2144, and their equivalogs. At early stages of growth, transgenic seedlings have developed narrow, upward pointing leaves with long petioles, possibly indicating a disruption in circadian-clock controlled processes or nyctinastic movements. Other transcription factor genes can be used to alter leaf shape in a significant manner from wild type, some of which may find use in ornamental applications.

Leaf morphology: altered leaf size. Large leaves, such as those produced in plants overexpressing G189, G1451, G2430 and their functional equivalogs, generally increase plant biomass. This provides benefit for crops where the vegetative portion of the plant is the marketable portion.

Leaf morphology: light green and variegated leaves. Transcription factor genes such as G635, G1494, G2144 and their equivalogs that provide an altered appearance may positively affect a plant's value to the ornamental horticulture industry.

Leaf morphology: glossy leaves. Transcription factor genes such as G30, G1792, G2583 and their equivalogs that induce the formation of glossy leaves generally do so by elevating levels of epidermal wax. Thus, the genes could be used to engineer changes in the composition and amount of leaf surface components, including waxes. The ability to manipulate wax composition, amount, or distribution could modify plant tolerance to drought and low humidity, or resistance to insects or pathogens. Additionally, wax may be a valuable commodity in some species, and altering its accumulation and/or composition could enhance yield.

Seed morphology: altered seed coloration. Presently disclosed transcription factor genes, including G156, G2105 and G2085, have also been used to modify seed color, which, along with the equivalogs of these genes, could provide added appeal to seeds or seed products.

Seed morphology: altered seed size and shape. The introduction into plants of presently disclosed transcription factor genes that increase (e.g., G450; G584; G1255; G2085; G2105; G2114) or decrease (e.g., G1040) the size of seeds may have a significant impact on yield and appearance, particularly when the product is the seed itself (e.g., in the case of grains, legumes, nuts, etc.). Seed size, in addition to seed coat integrity, thickness and permeability, seed water content and a number of other components including antioxidants and oligosaccharides, also affects seed longevity in storage, with larger seeds often being more desirable for prolonged storage.

Transcription factor genes that alter seed shape, including G1040, G1062, G1255 and their equivalogs, may both have ornamental applications and improve or broaden the appeal of seed products.

Leaf biochemistry: increased leaf wax. Overexpression of transcription factor genes, including G975, G1792 and G2085 and their equivalogs, which results in increased leaf wax, could be used to manipulate wax composition, amount, or distribution. These transcription factors can improve yield in those plants and crops from which wax is a valuable product. The genes may also be used to modify plant tolerance to drought and/or low humidity or resistance to insects, as well as plant appearance (glossy leaves). Increased wax deposition on the leaves of a plant may improve water use efficiency. Manipulation of these genes may reduce the wax coating on sunflower seeds; this wax fouls the oil extraction system during sunflower seed processing for oil. For the latter purpose or any other where wax reduction is valuable, antisense or co-suppression of the transcription factor genes in a tissue-specific manner would be valuable.

Leaf biochemistry: leaf prenyl lipids, including tocopherol. Prenyl lipids play a role in anchoring proteins in membranes or membranous organelles. Thus modifying the prenyl lipid content of seeds and leaves could affect membrane integrity and function. One important group of prenyl lipids, the tocopherols, have both anti-oxidant and vitamin E activity. A number of presently disclosed transcription factor genes, including G214, G652, G748, G987, G1543, and G2509, have been shown to modify the tocopherol composition of leaves in plants, and these genes and their equivalogs may thus be used to alter prenyl lipid content of leaves.

Leaf biochemistry: increased leaf insoluble sugars. Overexpression of a number of presently disclosed transcription factors, including G211, resulted in plants with altered leaf insoluble sugar content. This transcription factor and its equivalogs that alter plant cell wall composition have several potential applications, including altering food digestibility, plant tensile strength, wood quality and pathogen resistance, and use in pulp production. In particular, hemicellulose is not desirable in paper pulps because of its lack of strength compared with cellulose. Thus modulating the amounts of cellulose vs. hemicellulose in the plant cell wall is desirable for the paper/lumber industry. Increasing the insoluble carbohydrate content in various fruits, vegetables, and other edible consumer products will result in enhanced fiber content. Increased fiber content would not only provide health benefits in food products, but might also increase digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits and berries affects the quality of jam and catsup made from them. Changes in hemicellulose and pectin content could result in a superior consumer product.

Leaf biochemistry: increased leaf anthocyanin. Several presently disclosed transcription factor genes may be used to alter anthocyanin production in numerous plant species. Expression of presently disclosed transcription factor genes that increase flavonoid production in plants, including anthocyanins and condensed tannins, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance. G362, G663, G1482 and G1888 or their equivalogs, for example, could be used to alter anthocyanin production or accumulation. A number of flavonoids have been shown to have antimicrobial activity and could be used to engineer pathogen resistance. Several flavonoid compounds have health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids. Increased levels of condensed tannins in forage legumes would be an important agronomic trait because they prevent pasture bloat by collapsing protein foams within the rumen. For a review on the utilities of flavonoids and their derivatives, refer to Dixon et al. (1999) Trends Plant Sci. 4: 394-400.

Leaf and seed biochemistry: altered fatty acid content. A number of the presently disclosed transcription factor genes have been shown to alter the fatty acid composition in plants, and in seeds and leaves in particular. This modification suggests several utilities, including improving the nutritional value of seeds or whole plants. Dietary fatty acid ratios have been shown to have an effect on, for example, bone integrity and remodeling (see, for example, Weiler (2000) Pediatr. Res. 47:5 692-697). The ratio of dietary fatty acids may alter the pools of long-chain polyunsaturated fatty acids that serve as precursors for prostaglandin synthesis. In mammalian connective tissue, prostaglandins serve as important signals regulating the balance between resorption and formation in bone and cartilage. Thus dietary fatty acid ratios altered in seeds may affect the etiology and outcome of bone loss.

Transcription factors that reduce leaf fatty acids, for example, 16:3 fatty acids, may be used to control thylakoid membrane development, including proplastid to chloroplast development. The genes that encode these transcription factors might thus be useful for controlling the transition from proplastid to chromoplast in fruits and vegetables. It may also be desirable to change the expression of these genes to prevent cotyledon greening in Brassica napus or B. campestris to avoid green oil due to early frost.

A number of transcription factor genes are involved in mediating an aspect of the regulatory response to temperature. These genes may be used to alter the expression of desaturases that lead to production of 18:3 and 16:3 fatty acids, the balance of which affects membrane fluidity and mitigates damage to cell membranes and photosynthetic structures at high and low temperatures.

Seed biochemistry: modified seed oil and fatty acid content. The composition of seeds, particularly with respect to seed oil amounts and/or composition, is very important for the nutritional and caloric value and production of various food and feed products. Several of the presently disclosed transcription factor genes that alter seed oil content or seed lipid saturation could be used to improve the heat stability of oils or to improve the nutritional quality of seed oil, by, for example, reducing the number of calories in seed by decreasing oil or fatty acid content (e.g., G180; G192; G241; G1229; G1323; G1543), increasing the number of calories in animal feeds by increasing oil or fatty acid content (e.g., G162; G291; G427; G590; G598; G629; G715; G849; G1198; G1471; G1526; G1640; G1646; G1750; G1777; G1793; G1838; G1902; G1946; G1948; G2123; G2138; G2830), altering seed oil content (G504; G509; G519; G561; G567; G892; G961; G974; G1143; G1226; G1451; G1478; G1496; G1672; G1677; G1765; G2509; G2343), or altering the ratio of saturated to unsaturated lipids comprising the oils (e.g., G869; G1417; G2192).

Seed biochemistry: modified seed protein content. As with seed oils, the composition of seeds, particularly with respect to protein amounts and/or composition, is very important for the nutritional value and production of various food and feed products. A number of the presently disclosed transcription factor genes modify the protein concentrations in seeds, including G162; G226; G1323; G1419; G1818, which increase seed protein, G427; G1777; G1903; G1946, which decrease seed protein, and G162; G241; G509; G567; G597; G849; G892; G988; G1478; G1634; G1637; G1652; G1677; G1820; G1958; G2509; G2117; G2509, which alter seed protein content. Such modifications would provide nutritional benefits, and may be used to prolong storage, increase seed pest or disease resistance, or modify germination rates.

Seed biochemistry: seed prenyl lipids. Prenyl lipids play a role in anchoring proteins in membranes or membranous organelles. Thus, modifying the prenyl lipid content of seeds and leaves could affect membrane integrity and function. A number of presently disclosed transcription factor genes have been shown to modify the tocopherol composition of plants. α-Tocopherol is better known as vitamin E. Tocopherols such as α- and γ-tocopherol both have anti-oxidant activity.

Seed biochemistry: seed glucosinolates. A number of glucosinolates have been shown to have anti-cancer activity; thus, increasing the levels or altering the composition of these compounds by introducing several of the presently disclosed transcription factors, including G484 and G2340, can have a beneficial effect on human diet.

Glucosinolates are undesirable components of the oilseeds used in animal feed since they produce toxic effects. Low-glucosinolate varieties of canola, for example, have been developed to combat this problem. Glucosinolates form part of a plant's natural defense against insects. Modification of glucosinolate composition or quantity by introducing transcription factors that affect these characteristics can therefore afford increased protection from herbivores. Furthermore, in edible crops, tissue specific promoters can be used to ensure that these compounds accumulate specifically in tissues, such as the epidermis, which are not taken for consumption.

Seed biochemistry: increased seed anthocyanin. Several presently disclosed transcription factor genes may be used to alter anthocyanin production in the seeds of plants. As with leaf anthocyanins, expression of presently disclosed transcription factor genes that increase flavonoid (anthocyanins and condensed tannins) production in seeds, including G663 and its equivalogs, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance, antimicrobial activity and health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids.

Leaf and seed biochemistry: production of seed and leaf phytosterols. Presently disclosed transcription factor genes that modify levels of phytosterols in plants may have at least two utilities. First, phytosterols are an important source of precursors for the manufacture of human steroid hormones. Thus, regulation of transcription factor expression or activity could lead to elevated levels of important human steroid precursors for steroid semi-synthesis. For example, transcription factors that cause elevated levels of campesterol in leaves, or sitosterols and stigmasterols in seed crops, would be useful for this purpose. Second, phytosterols and their hydrogenated derivatives, phytostanols, have proven cholesterol-lowering properties, and transcription factor genes that modify the expression of these compounds in plants would thus provide health benefits.

Root biochemistry: increased root anthocyanin. Presently disclosed transcription factor genes, including G663, may be used to alter anthocyanin production in the root of plants. As described above for seed anthocyanins, expression of presently disclosed transcription factor genes that increase flavonoid (anthocyanins and condensed tannins) production in seeds, including G663 and its equivalogs, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance, antimicrobial activity and health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids.

Light response/shade avoidance: altered cotyledon, hypocotyl, petiole development, altered leaf orientation, constitutive photomorphogenesis, photomorphogenesis in low light. Presently disclosed transcription factor genes, including G183; G354; G1322; G1331; G1488; G1494; G1794; G2144; and G2555, that modify a plant's response to light may be useful for modifying plant growth or development, for example, photomorphogenesis in poor light, or accelerating flowering time in response to various light intensities, quality or duration to which a non-transformed plant would not similarly respond. Examples of such responses that have been demonstrated include leaf number and arrangement, and early flower bud appearances. Elimination of shading responses may lead to increased planting densities with subsequent yield enhancement. As these genes may also alter plant architecture, they may find use in the ornamental horticulture industry.

Pigment: increased anthocyanin level in various plant organs and tissues. In addition to seed, leaves and roots, as mentioned above, several presently disclosed transcription factor genes can be used to alter anthocyanin levels in one or more tissues. The potential utilities of these genes include alterations in pigment production for horticultural purposes, and possibly increasing stress resistance, antimicrobial activity and health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids.

Miscellaneous biochemistry: diterpenes in leaves and other plant parts. Depending on the plant species, varying amounts of diverse secondary biochemicals (often lipophilic terpenes) are produced and exuded or volatilized by trichomes. These exotic secondary biochemicals, which are relatively easy to extract because they are on the surface of the leaf, have been widely used in such products as flavors and aromas, drugs, pesticides and cosmetics. Thus, the overexpression of genes that are used to produce diterpenes in plants may be accomplished by introducing transcription factor genes that induce said overexpression. One class of secondary metabolites, the diterpenes, can affect several biological systems such as tumor progression, prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as insect pheromones and termite allomones, and can exhibit neurotoxic, cytotoxic and antimitotic activities. As a result of this functional diversity, diterpenes have been the target of research in several pharmaceutical ventures. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity.

Miscellaneous biochemistry: production of miscellaneous secondary metabolites. Microarray data suggest that flux through the aromatic amino acid biosynthetic pathways and primary and secondary metabolite biosynthetic pathways is up-regulated. Presently disclosed transcription factors have been shown to be involved in regulating alkaloid biosynthesis, in part by up-regulating the enzymes indole-3-glycerol phosphatase and strictosidine synthase. Phenylalanine ammonia lyase, chalcone synthase and trans-cinnamate mono-oxygenase are also induced, and are involved in phenylpropanoid biosynthesis.

Antisense and Co-Suppression

In addition to expression of the nucleic acids of the invention as gene replacement or plant phenotype modification nucleic acids, the nucleic acids are also useful for sense and anti-sense suppression of expression, e.g., to down-regulate expression of a nucleic acid of the invention as a further mechanism for modulating plant phenotype. That is, the nucleic acids of the invention, or subsequences or anti-sense sequences thereof, can be used to block expression of naturally occurring homologous nucleic acids. A variety of sense and anti-sense technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997) Antisense Technology: A Practical Approach IRL Press at Oxford University Press, Oxford, U.K. Antisense regulation is also described in Crowley et al. (1985) Cell 43: 633-641; Rosenberg et al. (1985) Nature 313: 703-706; Preiss et al. (1985) Nature 313: 27-32; Melton (1985) Proc. Natl. Acad. Sci. 82: 144-148; Izant and Weintraub (1985) Science 229: 345-352; and Kim and Wold (1985) Cell 42: 129-138. Additional methods for antisense regulation are known in the art. Antisense regulation has been used to reduce or inhibit expression of plant genes in, for example, European Patent Publication No. 271988. Antisense RNA may be used to reduce gene expression to produce a visible or biochemical phenotypic change in a plant (Smith et al. (1988) Nature, 334: 724-726; Smith et al. (1990) Plant Mol. Biol. 14: 369-379). In general, sense or anti-sense sequences are introduced into a cell, where they are optionally amplified, e.g., by transcription. Such sequences include both simple oligonucleotide sequences and catalytic sequences such as ribozymes.

For example, a reduction or elimination of expression (i.e., a “knock-out”) of a transcription factor or transcription factor homolog polypeptide in a transgenic plant, e.g., to modify a plant trait, can be obtained by introducing an antisense construct corresponding to the polypeptide of interest as a cDNA. For antisense suppression, the transcription factor or homolog cDNA is arranged in reverse orientation (with respect to the coding sequence) relative to the promoter sequence in the expression vector. The introduced sequence need not be the full length cDNA or gene, and need not be identical to the cDNA or gene found in the plant type to be transformed. Typically, the antisense sequence need only be capable of hybridizing to the target gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a higher degree of homology to the endogenous transcription factor sequence will be needed for effective antisense suppression. While antisense sequences of various lengths can be utilized, preferably, the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. Preferably, the length of the antisense sequence in the vector will be greater than 100 nucleotides. Transcription of an antisense construct as described results in the production of RNA molecules that are the reverse complement of mRNA molecules transcribed from the endogenous transcription factor gene in the plant cell.
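The reverse-complement relationship described above can be illustrated with a minimal sketch. The function names and the short example sequence are hypothetical and are not taken from the disclosure; the sketch assumes an unambiguous DNA alphabet (A, C, G, T) and ignores promoter, terminator and vector sequences.

```python
# Illustrative sketch only: names and sequences are hypothetical.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def sense_mrna(coding_strand: str) -> str:
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand: str) -> str:
    """RNA produced when the cDNA is placed in reverse orientation
    relative to the promoter: the reverse complement of the mRNA."""
    return reverse_complement(coding_strand).upper().replace("T", "U")


if __name__ == "__main__":
    cdna_fragment = "ATGGCCAAGT"          # hypothetical fragment
    print(sense_mrna(cdna_fragment))      # sense transcript
    print(antisense_rna(cdna_fragment))   # antisense transcript
```

In practice the introduced fragment would be far longer than this toy sequence, since the text above prefers antisense sequences of at least 30, and preferably more than 100, nucleotides.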

Suppression of endogenous transcription factor gene expression can also be achieved using a ribozyme. Ribozymes are RNA molecules that possess highly specific endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Pat. No. 4,987,071 and U.S. Pat. No. 5,543,508. Synthetic ribozyme sequences including antisense RNAs can be used to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules that hybridize to the antisense RNA are cleaved, which in turn leads to an enhanced antisense inhibition of endogenous gene expression.

Vectors in which RNA encoded by a transcription factor or transcription factor homolog cDNA is over-expressed can also be used to obtain co-suppression of a corresponding endogenous gene, e.g., in the manner described in U.S. Pat. No. 5,231,020 to Jorgensen. Such co-suppression (also termed sense suppression) does not require that the entire transcription factor cDNA be introduced into the plant cells, nor does it require that the introduced sequence be exactly identical to the endogenous transcription factor gene of interest. However, as with antisense suppression, the suppressive efficiency will be enhanced as specificity of hybridization is increased, e.g., as the introduced sequence is lengthened, and/or as the sequence similarity between the introduced sequence and the endogenous transcription factor gene is increased.

Vectors expressing an untranslatable form of the transcription factor mRNA (e.g., sequences comprising one or more stop codons or nonsense mutations) can also be used to suppress expression of an endogenous transcription factor, thereby reducing or eliminating its activity and modifying one or more traits. Methods for producing such constructs are described in U.S. Pat. No. 5,583,021. Preferably, such constructs are made by introducing a premature stop codon into the transcription factor gene. Alternatively, a plant trait can be modified by gene silencing using double-strand RNA (Sharp (1999) Genes and Development 13: 139-141). Another method for abolishing the expression of a gene is by insertion mutagenesis using the T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the mutants can be screened to identify those containing the insertion in a transcription factor or transcription factor homolog gene. Plants containing a single transgene insertion event at the desired gene can be crossed to generate homozygous plants for the mutation. Such methods are well known to those of skill in the art (see, for example, Koncz et al. (1992) Methods in Arabidopsis Research, World Scientific Publishing Co. Pte. Ltd., River Edge, N.J.).

Alternatively, a plant phenotype can be altered by eliminating an endogenous gene, such as a transcription factor or transcription factor homolog, e.g., by homologous recombination (Kempin et al. (1997) Nature 389: 802-803).

A plant trait can also be modified by using the Cre-lox system (for example, as described in U.S. Pat. No. 5,658,772). A plant genome can be modified to include first and second lox sites that are then contacted with a Cre recombinase. If the lox sites are in the same orientation, the intervening DNA sequence between the two sites is excised. If the lox sites are in the opposite orientation, the intervening sequence is inverted.
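The two outcomes of Cre-mediated recombination described above can be modeled with a short sketch. This is a simplified, hypothetical representation: the genome is a list of labeled elements, lox sites are the labels "lox>" and "lox<" (the arrow marks orientation), and only the first two lox sites are considered. Real inversion also reverse-complements each element, which labels cannot show.

```python
# Illustrative sketch only: the list-of-labels genome model is an assumption.
LOX_SITES = ("lox>", "lox<")


def cre_recombine(elements):
    """Apply Cre recombinase to the first two lox sites in `elements`.

    Same orientation     -> intervening DNA is excised (one lox site remains).
    Opposite orientation -> intervening DNA is inverted in place.
    """
    positions = [i for i, e in enumerate(elements) if e in LOX_SITES]
    if len(positions) < 2:
        return list(elements)  # nothing to recombine
    first, second = positions[0], positions[1]
    if elements[first] == elements[second]:
        # Excision: drop everything from after the first site through the second site.
        return elements[: first + 1] + elements[second + 1 :]
    # Inversion: reverse the order of the elements between the two sites.
    inverted = elements[first + 1 : second][::-1]
    return elements[: first + 1] + inverted + elements[second:]


if __name__ == "__main__":
    same = ["promoter", "lox>", "geneA", "geneB", "lox>", "terminator"]
    opposite = ["promoter", "lox>", "geneA", "geneB", "lox<", "terminator"]
    print(cre_recombine(same))
    print(cre_recombine(opposite))
```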

The polynucleotides and polypeptides of this invention can also be expressed in a plant in the absence of an expression cassette by manipulating the activity or expression level of the endogenous gene by other means, such as, for example, by ectopically expressing a gene by T-DNA activation tagging (Ichikawa et al. (1997) Nature 390: 698-701; Kakimoto et al. (1996) Science 274: 982-985). This method entails transforming a plant with a gene tag containing multiple transcriptional enhancers; once the tag has inserted into the genome, expression of a flanking gene coding sequence becomes deregulated. In another example, the transcriptional machinery in a plant can be modified so as to increase transcription levels of a polynucleotide of the invention (see, e.g., PCT Publications WO 96/06166 and WO 98/53057, which describe the modification of the DNA-binding specificity of zinc finger proteins by changing particular amino acids in the DNA-binding motif).

The transgenic plant can also include the machinery necessary for expressing or altering the activity of a polypeptide encoded by an endogenous gene, for example, by altering the phosphorylation state of the polypeptide to maintain it in an activated state.

Transgenic plants (or plant cells, or plant explants, or plant tissues) incorporating the polynucleotides of the invention and/or expressing the polypeptides of the invention can be produced by a variety of well established techniques as described above. Following construction of a vector, most typically an expression cassette, including a polynucleotide, e.g., encoding a transcription factor or transcription factor homolog, of the invention, standard techniques can be used to introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce a transgenic plant.

The plant can be any higher plant, including gymnosperms, monocotyledonous and dicotyledonous plants. Suitable protocols are available for Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), Cucurbitaceae (melons and cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), Solanaceae (potato, tomato, tobacco, peppers, etc.), and various other crops. See protocols described in Ammirato et al., Eds., (1984) Handbook of Plant Cell Culture—Crop Species, Macmillan Publ. Co., New York, N.Y.; Shimamoto et al. (1989) Nature 338: 274-276; Fromm et al. (1990) Bio/Technol. 8: 833-839; and Vasil et al. (1990) Bio/Technol. 8: 429-434.

Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells are now routine, and the selection of the most appropriate transformation technique will be determined by the practitioner. The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods can include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens mediated transformation. Transformation means introducing a nucleotide sequence into a plant in a manner to cause stable or transient expression of the sequence.

Successful examples of the modification of plant characteristics by transformation with cloned sequences which serve to illustrate the current knowledge in this field of technology, and which are herein incorporated by reference, include: U.S. Pat. Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042.

Following transformation, plants are preferably selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic or herbicide resistance on the transformed plants, and selection of transformants can be accomplished by exposing the plants to appropriate concentrations of the antibiotic or herbicide.

After transformed plants are selected and grown to maturity, those plants showing a modified trait are identified. The modified trait can be any of those traits described above. Additionally, whether the modified trait is due to changes in expression levels or activity of the polypeptide or polynucleotide of the invention can be determined by analyzing mRNA expression using Northern blots, RT-PCR or microarrays, or protein expression using immunoblots, Western blots or gel shift assays.

Integrated Systems—Sequence Identity

Additionally, the present invention may be an integrated system, computer or computer readable medium that comprises an instruction set for determining the identity of one or more sequences in a database. In addition, the instruction set can be used to generate or identify sequences that meet any specified criteria. Furthermore, the instruction set may be used to associate or link certain functional benefits, such as improved characteristics, with one or more identified sequences.

For example, the instruction set can include a sequence comparison or other alignment program, e.g., an available program such as the Wisconsin Package Version 10.0, including BLAST, FASTA, PILEUP, FINDPATTERNS or the like (GCG, Madison, Wis.). Public sequence databases such as GenBank, EMBL, Swiss-Prot and PIR, or private sequence databases such as the PHYTOSEQ sequence database (Incyte Genomics, Palo Alto, Calif.), can be searched.

Alignment of sequences for comparison can be conducted by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482-489, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443-453, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 85: 2444-2448, or by computerized implementations of these algorithms. After alignment, sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing the sequences over a comparison window to identify and compare local regions of sequence similarity. The comparison window can be a segment of at least about 20 contiguous positions, usually about 50 to about 200, more usually about 100 to about 150 contiguous positions. A description of the method is provided in Ausubel et al. supra.
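As an illustration of the local homology approach cited above, the following minimal sketch computes a Smith-Waterman local alignment score by dynamic programming. The scoring values (match, mismatch and a linear gap penalty) are assumptions chosen for the example and are not specified in this disclosure; the sketch returns only the best local score, not the alignment itself.

```python
# Illustrative sketch only: scoring parameters are assumed, not from the disclosure.
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    previous_row = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current_row = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current_row[j] = max(
                0,                                 # start a new local alignment
                previous_row[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous_row[j] + gap,             # gap in b
                current_row[j - 1] + gap,          # gap in a
            )
            best = max(best, current_row[j])
        previous_row = current_row
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Keeping only two rows of the scoring matrix limits memory to the length of the second sequence, which is sufficient when only the score is needed; recovering the aligned region would require retaining the full matrix for traceback.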

A variety of methods for determining sequence relationships can be used, including manual alignment and computer assisted sequence alignment and analysis. This latter approach is a preferred approach in the present invention, due to the increased throughput afforded by computer assisted methods. As noted above, a variety of computer programs for performing sequence alignment are available, or can be produced by one of skill.

One example algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410. Software for performing BLAST analyses is publicly available, e.g., through the National Library of Medicine's National Center for Biotechnology Information (ncbi.nlm.nih; see at world wide web (www) National Institutes of Health US government (gov) website). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al. supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. 89: 10915-10919). Unless otherwise indicated, “sequence identity” here refers to the % sequence identity generated from a tblastx using the NCBI version of the algorithm at the default settings using gapped alignments with the filter “off” (see, for example, NIH NLM NCBI website at ncbi.nlm.nih, supra).
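The seed-and-extend procedure described above can be sketched, for nucleotide sequences, roughly as follows. This is a simplified illustration and not the BLAST program itself: word hits are exact matches only (the neighborhood threshold T is not modeled), extension is ungapped, and the drop-off value X=20 is an assumed value chosen for the example. The defaults W=11, M=5 and N=-4 follow the BLASTN defaults stated above; the function names are hypothetical.

```python
def find_word_hits(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def _extend(query, subject, qi, sj, step, M, N, X, base):
    """Extend in one direction; return (best score gain, positions kept)."""
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += M if query[qi] == subject[sj] else N
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= X or base + score <= 0:
            # Halt: score fell by X from its maximum, or cumulative score <= 0.
            break
        qi += step
        sj += step
    return best, best_len


def extend_hit(query, subject, qi, sj, w=11, M=5, N=-4, X=20):
    """Extend a word hit both ways; return (score, query_start, subject_start, length)."""
    seed = w * M  # every position of an exact word hit is a match
    right, right_len = _extend(query, subject, qi + w, sj + w, 1, M, N, X, seed)
    left, left_len = _extend(query, subject, qi - 1, sj - 1, -1, M, N, X, seed + right)
    return seed + left + right, qi - left_len, sj - left_len, w + left_len + right_len
```

Each direction is trimmed back to the position at which its score was highest, so the returned segment corresponds to a single high scoring sequence pair built around the seed.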

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. 90: 5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence (and, therefore, in this context, homologous) if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, or less than about 0.01, or even less than about 0.001. An additional example of a useful sequence alignment algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments. The program can align, e.g., up to 300 sequences of a maximum length of 5,000 letters.
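As a small worked illustration of the smallest sum probability thresholds described above, the following sketch reports the most stringent of the three thresholds (0.001, 0.01, 0.1) that a given probability value satisfies. The function name is hypothetical, and treating "about" as a strict cutoff is a simplifying assumption.

```python
def similarity_tier(p, thresholds=(0.001, 0.01, 0.1)):
    """Return the strictest threshold that p falls below, or None if none applies."""
    for t in sorted(thresholds):
        if p < t:
            return t
    return None
```

Under this sketch, a smallest sum probability of 1.00E-101 or of 0.00018 both fall below the most stringent threshold of 0.001, whereas a value of 0.5 satisfies none of the thresholds.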

The integrated system or computer typically includes a user input interface allowing a user to selectively view one or more sequence records corresponding to the one or more character strings, as well as an instruction set which aligns the one or more character strings with each other or with an additional character string to identify one or more regions of sequence similarity. The system may include a link of one or more character strings with a particular phenotype or gene function. Typically, the system includes a user readable output element that displays an alignment produced by the alignment instruction set.

The methods of this invention can be implemented in a localized or distributed computing environment. In a distributed environment, the methods may be implemented on a single computer comprising multiple processors or on a multiplicity of computers. The computers can be linked, e.g., through a common bus, but more preferably the computer(s) are nodes on a network. The network can be a generalized or a dedicated local or wide-area network and, in certain preferred embodiments, the computers may be components of an intranet or an internet.

Thus, the invention provides methods for identifying a sequence similar or homologous to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by the polynucleotides, or otherwise noted herein, and may include linking or associating a given plant phenotype or gene function with a sequence. In the methods, a sequence database is provided (locally or across an internet or intranet) and a query is made against the sequence database using the relevant sequences herein and associated plant phenotypes or gene functions.

Any sequence herein can be entered into the database, before or after querying the database. This provides for both expansion of the database and, if done before the querying step, for insertion of control sequences into the database. The control sequences can be detected by the query to ensure the general integrity of both the database and the query. As noted, the query can be performed using a web browser based interface. For example, the database can be a centralized public database such as those noted herein, and the querying can be done from a remote terminal or computer across an internet or intranet.
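The use of control sequences described above can be sketched as follows. This is an illustrative, in-memory stand-in only: the class and function names are hypothetical, the record identifiers and sequences in the usage note are made up, and the position-by-position comparison in naive_identity is a placeholder for a real alignment program such as those noted herein. Each record may carry an associated phenotype or gene function annotation.

```python
def naive_identity(a, b):
    """Ungapped percent identity over the shorter sequence (placeholder comparison)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / n


class SequenceDatabase:
    def __init__(self):
        self.records = {}      # sequence identifier -> sequence
        self.annotations = {}  # sequence identifier -> phenotype or gene function
        self.controls = set()  # identifiers of inserted control sequences

    def add(self, seq_id, sequence, annotation=None, control=False):
        """Enter a sequence, optionally flagged as a control."""
        self.records[seq_id] = sequence
        self.annotations[seq_id] = annotation
        if control:
            self.controls.add(seq_id)

    def query(self, query_seq, min_identity=90.0):
        """Return (id, identity, annotation) hits, best first."""
        hits = []
        for seq_id, seq in self.records.items():
            identity = naive_identity(query_seq, seq)
            if identity >= min_identity:
                hits.append((seq_id, identity, self.annotations[seq_id]))
        return sorted(hits, key=lambda h: h[1], reverse=True)

    def controls_detected(self):
        """True if querying with each control sequence returns that control."""
        return all(
            any(hit_id == cid for hit_id, _, _ in self.query(self.records[cid]))
            for cid in self.controls
        )
```

In use, a control such as a record named "control_1" would be entered before querying, and controls_detected() would be checked after the query step; a False result would suggest a problem with either the database contents or the query procedure.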

Any sequence herein can be used to identify a similar, homologous, paralogous, or orthologous sequence in another plant. This provides means for identifying endogenous sequences in other plants that may be useful to alter a trait of progeny plants that result from crossing two plants of different strains.

For example, sequences that encode an ortholog of any of the sequences herein that naturally occur in a plant with a desired trait can be identified using the sequences disclosed herein. The plant is then crossed with a second plant of the same species but which does not have the desired trait to produce progeny which can then be used in further crossing experiments to produce the desired trait in the second plant. Therefore the resulting progeny plant contains no transgenes; expression of the endogenous sequence may also be regulated by treatment with a particular chemical or other means, such as EMR. Some examples of such compounds well known in the art include: ethylene; cytokinins; phenolic compounds, which stimulate the transcription of the genes needed for infection; specific monosaccharides and acidic environments which potentiate vir gene induction; acidic polysaccharides which induce one or more chromosomal genes; and opines; other mechanisms include light or dark treatment (for a review of examples of such treatments, see, Winans (1992) Microbiol. Rev. 56: 12-31; Eyal et al. (1992) Plant Mol. Biol. 19: 589-599; Chrispeels et al. (2000) Plant Mol. Biol. 42: 279-290; Piazza et al. (2002) Plant Physiol. 128: 1077-1086).

Table 10 lists sequences discovered to be orthologous to a number of representative transcription factors of the present invention. The column headings include the transcription factors listed by SEQ ID NO; corresponding Gene ID (GID) numbers; the species from which the orthologs to the transcription factors are derived; the type of sequence (i.e., DNA or protein) discovered to be orthologous to the transcription factors; and the SEQ ID NO of the orthologs, the latter corresponding to the ortholog SEQ ID NOs listed in the Sequence Listing.

TABLE 10
Orthologs of Representative Arabidopsis Transcription Factor Genes

SEQ ID NO: of Ortholog or Nucleotide Encoding Ortholog | GID NO | Species from Which Ortholog is Derived | Sequence type used for determination (DNA or Protein) | GID NO of Orthologous Arabidopsis Transcription Factor | SEQ ID NO: of Nucleotide Encoding Orthologous Arabidopsis Transcription Factor
468 |  | Glycine max | DNA | G19 | 3
469 |  | Glycine max | DNA | G19 | 3
470 |  | Glycine max | DNA | G19 | 3
471 |  | Glycine max | DNA | G19 | 3
472 |  | Oryza sativa | DNA | G19 | 3
473 |  | Oryza sativa | DNA | G19 | 3
474 |  | Oryza sativa | DNA | G19 | 3
475 |  | Zea mays | DNA | G19 | 3
476 |  | Zea mays | DNA | G19 | 3
477 |  | Glycine max | DNA | G22 | 5
478 |  | Glycine max | DNA | G22 | 5
491 |  | Glycine max | DNA | G28 | 9
492 |  | Glycine max | DNA | G28 | 9
493 |  | Glycine max | DNA | G28 | 9
494 |  | Glycine max | DNA | G28 | 9
495 |  | Glycine max | DNA | G28 | 9
496 |  | Glycine max | DNA | G28 | 9
497 |  | Glycine max | DNA | G28 | 9
498 |  | Glycine max | DNA | G28 | 9
499 |  | Oryza sativa | DNA | G28 | 9
500 |  | Zea mays | DNA | G28 | 9
501 |  | Oryza sativa | PRT | G28 | 9
502 |  | Oryza sativa | PRT | G28 | 9
503 |  | Mesembryanthemum crystallinum | PRT | G28 | 9
504 | G3643 | Glycine max | DNA | G47, G2133 | 11, 407
505 |  | Oryza sativa | PRT | G47, G2133 | 11, 407
550 | G3450 | Glycine max | DNA | G226, G682 | 37, 147
551 |  | Glycine max | DNA | G226, G682 | 37
552 |  | Glycine max | DNA | G226, G682 | 37, 147
553 | G3448 | Glycine max | DNA | G226, G682 | 37, 147
554 | G3449 | Glycine max | DNA | G226, G682 | 37, 147
555 |  | Oryza sativa | DNA | G226, G682 | 37, 147
556 | G3431 | Zea mays | DNA | G226, G682 | 37, 147
557 |  | Zea mays | DNA | G226, G682 | 37, 147
558 |  | Oryza sativa | PRT | G226, G682 | 37, 147
559 | G3393 | Oryza sativa | PRT | G226, G682 | 37, 147
610 |  | Glycine max | DNA | G353, G354 | 59, 61
611 |  | Glycine max | DNA | G353, G354 | 59, 61
612 |  | Glycine max | DNA | G353, G354 | 59, 61
613 |  | Oryza sativa | DNA | G353, G354 | 59, 61
614 |  | Zea mays | DNA | G353, G354 | 59, 61
615 |  | Zea mays | DNA | G353, G354 | 59, 61
616 |  | Zea mays | DNA | G353, G354 | 59, 61
617 |  | Zea mays | DNA | G353, G354 | 59, 61
618 |  | Zea mays | DNA | G353, G354 | 59, 61
619 |  | Zea mays | DNA | G353, G354 | 59, 61
620 |  | Zea mays | DNA | G353, G354 | 59, 61
621 |  | Oryza sativa | PRT | G353, G354 | 59, 61
622 |  | Oryza sativa | PRT | G353, G354 | 59, 61
623 |  | Oryza sativa | PRT | G353, G354 | 59, 61
624 |  | Oryza sativa | PRT | G353, G354 | 59, 61
625 |  | Oryza sativa | PRT | G353, G354 | 59, 61
626 |  | Oryza sativa | PRT | G353, G354 | 59, 61
746 |  | Glycine max | DNA | G481, G482 | 87, 89
747 |  | Glycine max | DNA | G481, G482 | 87, 89
748 |  | Glycine max | DNA | G481, G482 | 87, 89
749 | G3476 | Glycine max | DNA | G481, G482 | 87, 89
750 |  | Glycine max | DNA | G481, G482 | 87, 89
751 |  | Glycine max | DNA | G481, G482 | 87, 89
752 |  | Glycine max | DNA | G481, G482 | 87, 89
753 |  | Glycine max | DNA | G481, G482 | 87, 89
754 |  | Glycine max | DNA | G481, G482 | 87
755 |  | Glycine max | DNA | G481, G482 | 87
756 |  | Oryza sativa | DNA | G481, G482 | 87
757 |  | Oryza sativa | DNA | G481, G482 | 87, 89
758 |  | Zea mays | DNA | G481, G482 | 87
759 |  | Zea mays | DNA | G481, G482 | 87, 89
760 |  | Zea mays | DNA | G481, G482 | 87, 89
761 |  | Zea mays | DNA | G481, G482 | 87, 89
762 |  | Zea mays | DNA | G481, G482 | 87, 89
763 |  | Zea mays | DNA | G481, G482 | 87, 89
764 |  | Zea mays | DNA | G481, G482 | 87, 89
765 |  | Zea mays | DNA | G481, G482 | 87, 89
766 |  | Zea mays | DNA | G481, G482 | 87, 89
767 |  | Zea mays | DNA | G481, G482 | 87, 89
768 |  | Gossypium arboreum | DNA | G481, G482 | 87, 89
769 |  | Glycine max | DNA | G481, G482 | 87, 89
770 |  | Gossypium hirsutum | DNA | G481, G482 | 87, 89
771 |  | Lycopersicon esculentum | DNA | G481, G482 | 87, 89
772 |  | Lycopersicon esculentum | DNA | G481, G482 | 87, 89
773 |  | Medicago truncatula | DNA | G481, G482 | 87, 89
774 |  | Lycopersicon esculentum | DNA | G481, G482 | 87, 89
775 |  | Solanum tuberosum | DNA | G481, G482 | 87, 89
776 |  | Triticum aestivum | DNA | G481, G482 | 87, 89
777 |  | Hordeum vulgare | DNA | G481, G482 | 87, 89
778 |  | Triticum monococcum | DNA | G481, G482 | 87, 89
779 |  | Glycine max | DNA | G481, G482 | 89
780 |  | Oryza sativa | PRT | G481, G482 | 87, 89
781 |  | Oryza sativa | PRT | G481, G482 | 87, 89
782 |  | Oryza sativa | PRT | G481, G482 | 87, 89
783 |  | Oryza sativa | PRT | G481, G482 | 87, 89
784 |  | Oryza sativa | PRT | G481, G482 | 87, 89
785 |  | Zea mays | PRT | G481, G482 | 87, 89
786 |  | Zea mays | PRT | G481, G482 | 87, 89
787 |  | Oryza sativa | PRT | G481, G482 | 87, 89
788 |  | Oryza sativa | PRT | G481, G482 | 87, 89
789 |  | Oryza sativa | PRT | G481, G482 | 87, 89
790 | G3395 | Oryza sativa | PRT | G481, G482 | 87, 89
791 |  | Oryza sativa | PRT | G481, G482 | 87, 89
792 |  | Oryza sativa | PRT | G481, G482 | 87, 89
793 |  | Oryza sativa | PRT | G481, G482 | 87, 89
794 | G3397 | Oryza sativa | PRT | G481, G482 | 87, 89
795 |  | Oryza sativa | PRT | G481, G482 | 87, 89
796 | G3398 | Oryza sativa | PRT | G481, G482 | 87, 89
797 |  | Glycine max | PRT | G481, G482 | 87, 89
798 | G3476 | Glycine max | PRT | G481, G482 | 87, 89
799 |  | Glycine max | PRT | G481, G482 | 87, 89
800 | G3475 | Glycine max | PRT | G481, G482 | 87, 89
801 | G3472 | Glycine max | PRT | G481, G482 | 87, 89
802 |  | Glycine max | PRT | G481, G482 | 87, 89
803 |  | Glycine max | PRT | G481, G482 | 87, 89
804 |  | Zea mays | PRT | G481, G482 | 87, 89
805 | G3436 | Zea mays | PRT | G481, G482 | 87, 89
806 | G3434 | Zea mays | PRT | G481, G482 | 87, 89
807 |  | Zea mays | PRT | G481, G482 | 87, 89
825 |  | Glycine max | DNA | G489 | 93
826 |  | Glycine max | DNA | G489 | 93
827 |  | Glycine max | DNA | G489 | 93
828 |  | Glycine max | DNA | G489 | 93
829 |  | Glycine max | DNA | G489 | 93
830 |  | Glycine max | DNA | G489 | 93
831 |  | Glycine max | DNA | G489 | 93
832 |  | Oryza sativa | DNA | G489 | 93
833 |  | Oryza sativa | DNA | G489 | 93
834 |  | Zea mays | DNA | G489 | 93
835 |  | Oryza sativa | PRT | G489 | 93
836 |  | Oryza sativa | PRT | G489 | 93
837 |  | Oryza sativa | PRT | G489 | 93
981 |  | Oryza sativa | DNA | G634 | 127
982 |  | Oryza sativa | DNA | G634 | 127
983 |  | Oryza sativa | DNA | G634 | 127
984 |  | Zea mays | DNA | G634 | 127
985 |  | Zea mays | DNA | G634 | 127
986 |  | Zea mays | DNA | G634 | 127
987 |  | Oryza sativa | PRT | G634 | 127
988 |  | Oryza sativa | PRT | G634 | 127
1076 | G3450 | Glycine max | DNA | G682 | 147
1077 |  | Hordeum vulgare subsp. vulgare | DNA | G682 | 147
1078 |  | Populus tremula x Populus tremuloides | DNA | G682 | 147
1079 |  | Triticum aestivum | DNA | G682 | 147
1080 |  | Gossypium arboreum | DNA | G682 | 147
1081 |  | Oryza sativa | PRT | G682 | 147
1082 | G3392 | Oryza sativa | PRT | G682 | 147
1083 | G3445 | Glycine max | PRT | G682 | 147
1084 | G3450 | Glycine max | PRT | G682 | 147
1085 |  | Glycine max | PRT | G682 | 147
1086 | G3446 | Glycine max | PRT | G682 | 147
1087 | G3448 | Glycine max | PRT | G682 | 147
1088 | G3449 | Glycine max | PRT | G682 | 147
1089 | G3431 | Zea mays | PRT | G682 | 147
1090 |  | Zea mays | PRT | G682 | 147
1154 |  | Glycine max | DNA | G864 | 167
1155 |  | Glycine max | DNA | G864 | 167
1156 |  | Zea mays | DNA | G864 | 167
1157 |  | Oryza sativa | PRT | G864 | 167
1158 |  | Oryza sativa | PRT | G864 | 167
1159 |  | Glycine max | DNA | G867, G1930 | 169, 369
1160 | G3451 | Glycine max | DNA | G867, G1930 | 169, 369
1161 |  | Glycine max | DNA | G867, G1930 | 169, 369
1162 | G3452 | Glycine max | DNA | G867, G1930 | 169, 369
1163 |  | Glycine max | DNA | G867, G1930 | 169, 369
1164 |  | Glycine max | DNA | G867, G1930 | 169
1165 |  | Oryza sativa | DNA | G867, G1930 | 169
1166 |  | Oryza sativa | DNA | G867, G1930 | 169, 369
1167 |  | Zea mays | DNA | G867, G1930 | 169, 369
1168 |  | Zea mays | DNA | G867, G1930 | 169, 369
1169 |  | Zea mays | DNA | G867, G1930 | 169, 369
1170 |  | Zea mays | DNA | G867, G1930 | 169, 369
1171 |  | Glycine max | DNA | G867, G1930 | 169, 369
1172 |  | Mesembryanthemum crystallinum | DNA | G867, G1930 | 169, 369
1173 |  | Lycopersicon esculentum | DNA | G867, G1930 | 169, 369
1174 |  | Solanum tuberosum | DNA | G867, G1930 | 169, 369
1175 |  | Hordeum vulgare | DNA | G867, G1930 | 169, 369
1176 | G3388 | Oryza sativa | PRT | G867, G1930 | 169, 369
1177 | G3389 | Oryza sativa | PRT | G867, G1930 | 169, 369
1178 |  | Oryza sativa | PRT | G867, G1930 | 169, 369
1179 | G3390 | Oryza sativa | PRT | G867, G1930 | 169, 369
1180 |  | Oryza sativa | PRT | G867, G1930 | 169, 369
1181 |  | Oryza sativa | PRT | G867, G1930 | 169, 369
1182 | G3451 | Glycine max | PRT | G867, G1930 | 169, 369
1183 | G3452 | Glycine max | PRT | G867, G1930 | 169, 369
1184 | G3453 | Glycine max | PRT | G867, G1930 | 169, 369
1185 | G3433 | Zea mays | PRT | G867, G1930 | 169, 369
1186 |  | Zea mays | PRT | G867, G1930 | 169, 369
1204 |  | Glycine max | DNA | G912 | 185
1205 |  | Glycine max | DNA | G912 | 185
1206 |  | Glycine max | DNA | G912 | 185
1207 |  | Glycine max | DNA | G912 | 185
1208 |  | Glycine max | DNA | G912 | 185
1209 |  | Glycine max | DNA | G912 | 185
1210 |  | Glycine max | DNA | G912 | 185
1211 |  | Oryza sativa | DNA | G912 | 185
1212 |  | Oryza sativa | DNA | G912, G913 | 185, 187
1213 | G3440 | Zea mays | DNA | G912 | 185
1214 |  | Zea mays | DNA | G912 | 185
1215 |  | Zea mays | DNA | G912, G913 | 185, 187
1216 |  | Zea mays | DNA | G912 | 185
1217 |  | Zea mays | DNA | G912 | 185
1218 |  | Brassica napus | DNA | G912, G913 | 185, 187
1219 |  | Solanum tuberosum | DNA | G912 | 185
1220 |  | Descurainia sophia | DNA | G912 | 185
1221 | G3377 | Oryza sativa | PRT | G912 | 185
1222 | G3373 | Oryza sativa | PRT | G912, G913 | 185, 187
1223 | G3375 | Oryza sativa | PRT | G912, G913 | 185, 187
1224 | G3372 | Oryza sativa | PRT | G912 | 185
1225 |  | Brassica napus | PRT | G912 | 185
1226 |  | Nicotiana tabacum | PRT | G912 | 185
1227 |  | Oryza sativa | PRT | G912 | 185
1228 |  | Oryza sativa | PRT | G912 | 185
1229 |  | Oryza sativa | PRT | G912 | 185
1230 | G3379 | Oryza sativa | PRT | G912 | 185
1231 |  | Oryza sativa | PRT | G912 | 185
1232 | G3376 | Oryza sativa | PRT | G912 | 185
1233 |  | Oryza sativa | PRT | G912 | 185
1234 |  | Oryza sativa | PRT | G912 | 185
1235 |  | Oryza sativa | PRT | G912 | 185
1236 |  | Oryza sativa | PRT | G912 | 185
1237 |  | Glycine max | PRT | G912 | 185
1238 |  | Glycine max | PRT | G912 | 185
1239 |  | Glycine max | PRT | G912 | 185
1240 |  | Glycine max | PRT | G912 | 185
1241 |  | Glycine max | PRT | G912 | 185
1242 |  | Glycine max | PRT | G912 | 185
1243 |  | Glycine max | PRT | G912 | 185
1244 |  | Zea mays | PRT | G912 | 185
1245 |  | Zea mays | PRT | G912 | 185
1246 | G3440 | Zea mays | PRT | G912 | 185
1247 |  | Zea mays | PRT | G912 | 185
1248 |  | Zea mays | PRT | G912 | 185
1249 |  | Glycine max | DNA | G922 | 189
1250 | G3811 | Glycine max | DNA | G922 | 189
1251 |  | Glycine max | DNA | G922 | 189
1252 |  | Oryza sativa | DNA | G922 | 189
1253 |  | Oryza sativa | DNA | G922 | 189
1254 |  | Oryza sativa | PRT | G922 | 189
1255 |  | Oryza sativa | PRT | G922 | 189
1256 |  | Oryza sativa | PRT | G922 | 189
1257 | G3813 | Oryza sativa | PRT | G922 | 189
1258 |  | Glycine max | DNA | G926 | 191
1259 |  | Glycine max | DNA | G926 | 191
1260 |  | Oryza sativa | DNA | G926 | 191
1261 |  | Oryza sativa | DNA | G926 | 191
1262 |  | Zea mays | DNA | G926 | 191
1263 |  | Brassica napus | PRT | G926 | 191
1292 |  | Glycine max | DNA | G975, G2583 | 199, 449
1293 |  | Glycine max | DNA | G975, G2583 | 199, 449
1294 |  | Glycine max | DNA | G975, G2583 | 199, 449
1295 |  | Glycine max | DNA | G975, G2583 | 199, 449
1296 |  | Glycine max | DNA | G975, G2583 | 199, 449
1297 |  | Oryza sativa | DNA | G975 | 199
1298 |  | Oryza sativa | DNA | G975, G2583 | 199, 449
1299 |  | Zea mays | DNA | G975, G2583 | 199, 449
1300 |  | Zea mays | DNA | G975, G2583 | 199, 449
1301 |  | Brassica rapa | DNA | G975, G2583 | 199, 449
1302 |  | Oryza sativa | PRT | G975, G2583 | 199, 449
1393 |  | Glycine max | DNA | G1069, G2153 | 221, 417
1394 |  | Glycine max | DNA | G1069, G2153 | 221, 417
1395 |  | Oryza sativa | PRT | G1069, G1073, G2153 | 221, 223, 417
1396 |  | Zea mays | DNA | G1069, G2153 | 221, 417
1397 |  | Lotus japonicus | DNA | G1069, G2153 | 221, 417
1398 |  | Lycopersicon esculentum | DNA | G1073 | 223
1399 | G3399 | Oryza sativa | PRT | G1073 | 223
1400 |  | Oryza sativa | PRT | G1073 | 223
1401 | G3400 | Oryza sativa | PRT | G1073 | 223
1402 |  | Oryza sativa | PRT | G1073 | 223
1403 |  | Oryza sativa | PRT | G1073 | 223
1404 |  | Oryza sativa | PRT | G1073 | 223
1405 |  | Oryza sativa | PRT | G1073 | 223
1406 |  | Oryza sativa | PRT | G1073 | 223
1407 |  | Oryza sativa | PRT | G1073 | 223
1408 |  | Oryza sativa | PRT | G1073 | 223
1409 |  | Oryza sativa | PRT | G1073 | 223
1410 |  | Oryza sativa | PRT | G1073 | 223
1411 |  | Glycine max | PRT | G1073 | 223
1412 |  | Glycine max | PRT | G1073 | 223
1413 |  | Glycine max | PRT | G1073 | 223
1414 |  | Glycine max | PRT | G1073 | 223
1415 |  | Glycine max | PRT | G1073 | 223
1416 |  | Glycine max | PRT | G1073 | 223
1417 |  | Glycine max | PRT | G1073 | 223
1418 |  | Zea mays | PRT | G1073 | 223
1419 |  | Glycine max | DNA | G1075 | 225
1420 |  | Glycine max | DNA | G1075 | 225
1421 |  | Glycine max | DNA | G1075 | 225
1422 |  | Glycine max | DNA | G1075 | 225
1423 |  | Glycine max | DNA | G1075 | 225
1424 |  | Oryza sativa | DNA | G1075 | 225
1425 |  | Oryza sativa | DNA | G1075 | 225
1426 |  | Oryza sativa | DNA | G1075 | 225
1587 |  | Glycine max | DNA | G1411, G2509 | 269, 439
1588 |  | Glycine max | DNA | G1411, G2509 | 269, 439
1589 |  | Glycine max | DNA | G1411, G2509 | 269, 439
1590 |  | Glycine max | DNA | G1411, G2509 | 269, 439
1591 |  | Zea mays | DNA | G1411, G2509 | 269, 439
1604 |  | Glycine max | DNA | G1451 | 277
1605 |  | Glycine max | DNA | G1451 | 277
1606 |  | Oryza sativa | DNA | G1451 | 277
1607 |  | Oryza sativa | DNA | G1451 | 277
1608 |  | Oryza sativa | DNA | G1451 | 277
1609 |  | Zea mays | DNA | G1451 | 277
1610 |  | Zea mays | DNA | G1451 | 277
1611 |  | Zea mays | DNA | G1451 | 277
1612 |  | Zea mays | DNA | G1451 | 277
1613 |  | Medicago truncatula | DNA | G1451 | 277
1614 |  | Solanum tuberosum | DNA | G1451 | 277
1615 |  | Zea mays | DNA | G1451 | 277
1616 |  | Sorghum propinquum | DNA | G1451 | 277
1617 |  | Glycine max | DNA | G1451 | 277
1618 |  | Sorghum bicolor | DNA | G1451 | 277
1619 |  | Hordeum vulgare | DNA | G1451 | 277
1620 |  | Lycopersicon esculentum | DNA | G1451 | 277
1621 |  | Oryza sativa | PRT | G1451 | 277
1622 |  | Oryza sativa | PRT | G1451 | 277
1623 |  | Oryza sativa | PRT | G1451 | 277
1624 |  | Oryza sativa | PRT | G1451 | 277
1671 |  | Glycine max | DNA | G1543 | 303
1672 |  | Oryza sativa | DNA | G1543 | 303
1673 |  | Zea mays | DNA | G1543 | 303
1674 |  | Oryza sativa | PRT | G1543 | 303
1728 |  | Glycine max | DNA | G1792 | 331
1729 |  | Glycine max | DNA | G1792 | 331
1730 |  | Glycine max | DNA | G1792 | 331
1731 |  | Glycine max | DNA | G1792 | 331
1732 |  | Glycine max | DNA | G1792 | 331
1733 |  | Zea mays | DNA | G1792 | 331
1734 |  | Lycopersicon esculentum | DNA | G1792 | 331
1735 | G3380 | Oryza sativa | PRT | G1792 | 331
1736 | G3381 | Oryza sativa indica | PRT | G1792 | 331
1737 | G3383 | Oryza sativa japonica | PRT | G1792 | 331
1795 |  | Oryza sativa | DNA | G1930 | 369
1908 |  | Medicago truncatula | DNA | G2155 | 419
1909 |  | Medicago truncatula | DNA | G2155 | 419
1910 |  | Glycine max | DNA | G2155 | 419
2907 | G3472 | Glycine max | DNA | G481, G482 | 87, 89
2908 | G3475 | Glycine max | DNA | G481, G482 | 87, 89
2909 | G3476 | Glycine max | DNA | G481, G482 | 87, 89
2910 | G3395 | Oryza sativa | DNA | G481, G482 | 87, 89
2911 | G3397 | Oryza sativa | DNA | G481, G482 | 87, 89
2912 | G3398 | Oryza sativa | DNA | G481, G482 | 87, 89
2913 | G3434 | Zea mays | DNA | G481, G482 | 87, 89
2914 | G3436 | Zea mays | DNA | G481, G482 | 87, 89
2915 | G3445 | Glycine max | DNA | G682 | 147
2916 | G3446 | Glycine max | DNA | G682 | 147
2917 | G3448 | Glycine max | DNA | G682 | 147
2918 | G3449 | Glycine max | DNA | G682 | 147
2919 | G3450 | Glycine max | DNA | G682 | 147
2920 | G3392 | Oryza sativa | DNA | G682 | 147
2921 | G3393 | Oryza sativa | DNA | G226, G682 | 37, 147
2922 | G3431 | Zea mays | DNA | G682 | 147
2923 | G3451 | Glycine max | DNA | G867, G1930 | 169, 369
2924 | G3452 | Glycine max | DNA | G867, G1930 | 169, 369
2925 | G3453 | Glycine max | DNA | G867, G1930 | 169, 369
2926 | G3388 | Oryza sativa | DNA | G867, G1930 | 169, 369
2927 | G3389 | Oryza sativa | DNA | G867, G1930 | 169, 369
2928 | G3390 | Oryza sativa | DNA | G867, G1930 | 169, 369
2929 | G3433 | Zea mays | DNA | G867, G1930 | 169, 369
2930 | G3376 | Oryza sativa | DNA | G912 | 185
2931 | G3372 | Oryza sativa | DNA | G912 | 185
2932 | G3373 | Oryza sativa | DNA | G912, G913 | 185, 187
2933 | G3375 | Oryza sativa | DNA | G912, G913 | 185, 187
2934 | G3377 | Oryza sativa | DNA | G912 | 185
2935 | G3379 | Oryza sativa | DNA | G912 | 185
2936 | G3440 | Zea mays | DNA | G912 | 185
2937 | G3399 | Oryza sativa | DNA | G1073 | 223
2938 | G3400 | Oryza sativa | DNA | G1073 | 223
2939 | G3380 | Oryza sativa | DNA | G1792 | 331
2940 | G3381 | Oryza sativa indica | DNA | G1792 | 331
2941 | G3383 | Oryza sativa japonica | DNA | G1792 | 331
2942 | G3643 | Glycine max | PRT | G47, G2133 | 11, 407
2943 | G3811 | Glycine max | PRT | G922 | 189
2944 | G3813 | Oryza sativa | DNA | G922 | 189
2945 | G3429 | Oryza sativa | DNA | G481, G482 | 87, 89
2946 | G3429 | Oryza sativa | PRT | G481, G482 | 87, 89
2947 | G3470 | Glycine max | DNA | G481, G482 | 87, 89
2948 | G3470 | Glycine max | PRT | G481, G482 | 87, 89
2949 | G3471 | Glycine max | DNA | G481, G482 | 87, 89
2950 | G3471 | Glycine max | PRT | G481, G482 | 87, 89

Table 11 lists a summary of homologous sequences identified using BLAST (tblastx program). The first column shows the polynucleotide sequence identifier (SEQ ID NO), the second column shows the corresponding cDNA identifier (Gene ID), the third column shows the orthologous or homologous polynucleotide GenBank Accession Number (Test Sequence ID), the fourth column shows the calculated probability value that the sequence identity is due to chance (Smallest Sum Probability), the fifth column shows the plant species from which the test sequence was isolated (Test Sequence Species), and the sixth column shows the orthologous or homologous test sequence GenBank annotation (Test Sequence GenBank Annotation).

TABLE 11 Summary of representative sequences that are homologous topresently disclosed transcription factors SEQ Smallest ID Test SequenceSum Test Sequence Test Sequence GenBank NO: GID ID Probability SpeciesAnnotation 3 G19 BG321358 1.00E−101 Descurainia sophia Ds01_07d03_RDs01_AAFC_ECORC_cold_stress 3 G19 BH444831 1.00E−77 Brassica oleraceaBOHPW42TR BOHP Brassica oleracea genomic 3 G19 BM412184 2.00E−43Lycopersicon EST586511 tomato breaker esculentum fruit Lyco 3 G19BU837697 3.00E−43 Populus tremula x Populus T104G02 Populus apicatremuloides 3 G19 CA784650 6.00E−43 Glycine max sat87a10.y1 Gm-c1062Glycine max cDNA clone SOY 3 G19 BU819833 3.00E−41 Populus tremulaUA48BPB07 Populus tremula cambium cDNA libr 3 G19 BU870388 4.00E−41Populus balsamifera Q011H05 Populus flow subsp. trichocarpa 3 G19CA797119 1.00E−38 Theobroma cacao Cac_BL_4204 Cac_BL (Bean and Leaf fromAmel 3 G19 BI436183 2.00E−38 Solanum tuberosum EST538944 cSTE Solanumtuberosum cDNA clo 3 G19 BQ989448 2.00E−36 Lactuca sativaQGF17L05.yg.ab1 QG_EFGHJ lettuce serriola La 3 G19 gi10798644 5.70E−36Nicotiana tabacum AP2 domain-containing transcription fac 3 G19gi6176534 2.40E−35 Oryza sativa EREBP-like protein. 3 G19 gi16882337.50E−34 Solanum tuberosum DNA binding protein homolog. 3 G19 gi220740461.50E−33 Lycopersicon transcription factor JERF1. esculentum 3 G19gi18496063 4.90E−33 Fagus sylvatica ethylene responsive element bindingprote 3 G19 gi20805105 2.10E−32 Oryza sativa contains ESTs AU06(japonica cultivar- group) 3 G19 gi24940524 2.30E−31 Triticum aestivumethylene response element binding prote 3 G19 gi18266198 2.30E−31Narcissus AP-2 domain containing pseudonarcissus protein. 3 G19gi3264767 1.30E−30 Prunus armeniaca AP2 domain containing protein. 3 G19gi24817250 4.00E−28 Cicer arietinum transcription factor EREBP- likeprotein. 5 G22 AB016264 9.00E−48 Nicotiana sylvestris nserf2 gene forethylene- responsive el 5 G22 TOBBY4A 1.00E−47 Nicotiana tabacum mRNAfor ERF1, complete cds. 
5 G22 AP004533 4.00E−47 Lotus japonicus genomicDNA, chromosome 3, clone: LjT14G02, 5 G22 LEU89255 6.00E−47 LycopersiconDNA-binding protein Pti4 esculentum mRNA, comp 5 G22 BQ517082 6.00E−46Solanum tuberosum EST624497 Generation of a set of potato c 5 G22BE449392 1.00E−45 Lycopersicon EST356151 L. hirsutum hirsutum trichome,Corne 5 G22 AF245119 5.00E−45 Mesembryanthemum AP2-related transcriptionfac crystallinum 5 G22 BQ165291 7.00E−45 Medicago truncatula EST611160KVKC Medicago truncatula cDNA 5 G22 AW618245 8.00E−38 LycopersiconEST314295 L. pennellii pennellii trichome, Cor 5 G22 BG444654 2.00E−36Gossypium arboreum GA_Ea0025B11f Gossypium arboreum 7-10 d 5 G22gi1208495 6.10E−48 Nicotiana tabacum ERF1. 5 G22 gi3342211 3.30E−47Lycopersicon Pti4. esculentum 5 G22 gi8809571 8.90E−47 Nicotianasylvestris ethylene-responsive element binding 5 G22 gi17385636 2.70E−36Matricaria ethylene-responsive element chamomilla binding 5 G22gi8980313 2.50E−33 Catharanthus roseus AP2-domain DNA-binding protein. 5G22 gi7528276 8.60E−33 Mesembryanthemum AP2-related transcription fcrystallinum 5 G22 gi21304712 3.10E−28 Glycine max ethylene-responsiveelement binding protein 1 5 G22 gi14140141 1.50E−26 Oryza sativaputative AP2-related transcription factor. 5 G22 gi15623863 1.30E−22Oryza sativa contains EST~hypot (japonica cultivar- group) 5 G22gi4099914 3.10E−21 Stylosanthes hamata ethylene-responsive elementbinding p 9 G28 AF245119 2.00E−72 Mesembryanthemum AP2-relatedtranscription fac crystallinum 9 G28 BQ165291 1.00E−68 Medicagotruncatula EST611160 KVKC Medicago truncatula cDNA 9 G28 AB0162641.00E−57 Nicotiana sylvestris nserf2 gene for ethylene- responsive el 9G28 TOBBY4D 2.00E−57 Nicotiana tabacum Tobacco mRNA for EREBP- 2,complete cds. 
9 G28 BQ047502 2.00E−57 Solanum tuberosum EST596620 P.infestans- challenged potato 9 G28 LEU89255 2.00E−56 LycopersiconDNA-binding protein Pti4 esculentum mRNA, comp 9 G28 BH454277 2.00E−54Brassica oleracea BOGSI45TR BOGS Brassica oleracea genomic 9 G28BE449392 1.00E−53 Lycopersicon EST356151 L. hirsutum hirsutum trichome,Corne 9 G28 AB035270 2.00E−50 Matricaria McEREBP1 mRNA for chamomillaethylene-responsive 9 G28 AW233956 5.00E−50 Glycine max sf32e02.y1Gm-c1028 Glycine max cDNA clone GENO 9 G28 gi7528276 6.10E−71Mesembryanthemum AP2-related transcription f crystallinum 9 G28gi8809571 3.30E−56 Nicotiana sylvestris ethylene-responsive elementbinding 9 G28 gi3342211 4.20E−56 Lycopersicon Pti4. esculentum 9 G28gi1208498 8.70E−56 Nicotiana tabacum EREBP-2. 9 G28 gi14140141 4.20E−49Oryza sativa putative AP2-related transcription factor. 9 G28 gi173856363.00E−46 Matricaria ethylene-responsive element chamomilla binding 9 G28gi21304712 2.90E−31 Glycine max ethylene-responsive element bindingprotein 1 9 G28 gi15623863 5.60E−29 Oryza sativa contains EST~hypot(japonica cultivar- group) 9 G28 gi8980313 1.20E−26 Catharanthus roseusAP2-domain DNA-binding protein. 9 G28 gi4099921 3.10E−21 Stylosantheshamata EREBP-3 homolog. 
11 G47 BG543936 1.00E−60 Brassica rapa subsp.E1686 Chinese cabbage etiol pekinensis 11 G47 BH420519 3.00E−43 Brassicaoleracea BOGUH88TF BOGU Brassica oleracea genomic 11 G47 AU2926033.00E−30 Zinnia elegans AU292603 zinnia cultured mesophyll cell equa 11G47 BE320193 1.00E−24 Medicago truncatula NF024B04RT1F1029 Developingroot Medica 11 G47 AAAA01000718 1.00E−22 Oryza sativa (indica ( )scaffold000718 cultivar-group) 11 G47 AP003379 2.00E−22 Oryza sativachromosome 1 clone P0408G07, *** SEQUENCING IN 11 G47 AC124836 8.00E−21Oryza sativa ( ) chromosome 5 clo (japonica cultivar- group) 11 G47BZ403609 2.00E−20 Zea mays OGABN17TM ZM_0.7_1.5_KB Zea mays genomicclone ZMM 11 G47 BM112772 6.00E−17 Solanum tuberosum EST560308 potatoroots Solanum tuberosum 11 G47 BQ698717 1.00E−16 Pinus taedaNXPV_148_C06_F NXPV (Nsf Xylem Planings wood Ve 11 G47 gi201612396.90E−24 Oryza sativa hypothetical prote (japonica cultivar- group) 11G47 gi14140155 6.80E−17 Oryza sativa putative AP2 domain transcriptionfactor. 11 G47 gi21908034 7.00E−15 Zea mays DRE binding factor 2. 11 G47gi20303011 1.90E−14 Brassica napus CBF-like protein CBF5. 11 G47gi8571476 3.00E−14 Atriplex hortensis apetala2 domain-containingprotein. 11 G47 gi8980313 2.10E−13 Catharanthus roseus AP2-domainDNA-binding protein. 11 G47 gi19071243 4.40E−13 Hordeum vulgare CRT/DREbinding factor 1. 11 G47 gi18650662 5.60E−13 Lycopersicon ethyleneresponse factor 1. esculentum 11 G47 gi17385636 1.20E−12 Matricariaethylene-responsive element chamomilla binding 11 G47 gi1208498 1.50E−12Nicotiana tabacunm EREBP-2. 37 G226 BU872107 2.00E−21 Populusbalsamifera Q039C07 Populus flow subsp. 
trichocarpa 37 G226 BU8318492.00E−21 Populus tremula x Populus T026E01 Populus apica tremuloides 37G226 BM437313 9.00E−21 Vitis vinifera VVA017F06_54121 An expressedsequence tag da 37 G226 BI699876 1.00E−19 Glycine max sag49b09.y1Gm-c1081 Glycine max cDNA clone GEN 37 G226 AL750151 4.00E−16 Pinuspinaster AL750151 AS Pinus pinaster cDNA clone AS06C1 37 G226 CA7440132.00E−12 Triticum aestivum wri1s.pk006.m22 wri1s Triticum aestivum c 37G226 BH961028 3.00E−12 Brassica oleracea odj30d06.g1 B.oleracea002Brassica olerac 37 G226 BJ472717 8.00E−12 Hordeum vulgare BJ472717 K.Sato subsp. vulgare unpublished 37 G226 BF617445 8.00E−12 Hordeumvulgare HVSMEc0017G08f Hordeum vulgare seedling sho 37 G226 CA7622992.00E−11 Oryza sativa (indica BR060003B10F03.abl IRR cultivar-group) 37G226 gi9954118 2.20E−11 Solanum tuberosum tuber-specific and sucrose-responsive e 37 G226 gi14269333 2.50E−10 Gossypium raimondii myb-liketranscription factor Myb 3. 37 G226 gi14269335 2.50E−10 Gossypiummyb-like transcription factor herbaceum Myb 3. 37 G226 gi142693372.50E−10 Gossypium hirsutum myb-like transcription factor Myb 3. 37 G226gi23476297 2.50E−10 Gossypioides kirkii myb-like transcription factor 3.37 G226 gi15082210 8.50E−10 Fragaria x ananassa transcription factorMYB1. 37 G226 gi19072770 8.50E−10 Oryza sativa typical P-type R2R3 Mybprotein. 37 G226 gi15042108 1.40E−09 Zea mays subsp. CI protein.parviglumis 37 G226 gi15042124 1.40E−09 Zea luxurians CI protein. 37G226 gi20514371 1.40E−09 Cucumis sativus werewolf. 59 G353 BQ7908315.00E−68 Brassica rapa subsp. E4675 Chinese cabbage etiol pekinensis 59G353 BZ019752 1.00E−67 Brassica oleracea oed85c06.g1 B.oleracea002Brassica olerac 59 G353 L46574 6.00E−40 Brassica rapa BNAF1975 Mustardflower buds Brassica rapa cD 59 G353 AB006601 7.00E−26 Petunia x hybridamRNA for ZPT2-14, complete cds. 
59 G353 BM437146 2.00E−25 Vitis viniferaVVA015A06_53787 An expressed sequence tag da 59 G353 BI422808 1.00E−24Lycopersicon EST533474 tomato callus, esculentum TAMU Lycop 59 G353BU867080 1.00E−24 Populus tremula x Populus S074B01 Populus imbibtremuloides 59 G353 BM527789 3.00E−23 Glycine max sal65h07.y1 Gm-c1061Glycine max cDNA clone SOY 59 G353 BQ980246 5.00E−23 Lactuca sativaQGE10I12.yg.abl QG_EFGHJ lettuce serriola La 59 G353 BQ121106 2.00E−22Solanum tuberosum EST606682 mixed potato tissues Solanum tu 59 G353gi2346976 6.50E−28 Petunia x hybrida ZPT2-13. 59 G353 gi156238204.40E−25 Oryza sativa hypothetical protein. 59 G353 gi21104613 1.40E−18Oryza sativa contains ESTs AU07 (japonica cultivar- group) 59 G353gi485814 3.10E−13 Triticum aestivum WZF1. 59 G353 gi7228329 4.00E−12Medicago sativa putative TFIIIA (or kruppel)- like zinc fi 59 G353gi1763063 1.70E−11 Glycine max SCOF-1. 59 G353 gi2981169 2.60E−11Nicotiana tabacum osmotic stress-induced zinc- finger prot 59 G353gi4666360 1.10E−10 Datisca glomerata zinc-finger protein 1. 59 G353gi2129892 2.30E−08 Pisum sativum probable finger protein Pszf1 - gardenpea. 59 G353 gi2058504 0.00018 Brassica rapa zinc-finger protein-1. 61G354 BZ083260 5.00E−49 Brassica oleracea lle29f02.g1 B.oleracea002Brassica olerac 61 G354 BQ790831 8.00E−45 Brassica rapa subsp. E4675Chinese cabbage etiol pekinensis 61 G354 AB006600 6.00E−27 Petunia xhybrida mRNA for ZPT2-13, complete cds. 
61 | G354 | L46574 | 1.00E−26 | Brassica rapa | BNAF1975 Mustard flower buds Brassica rapa cD
61 | G354 | BM437146 | 3.00E−24 | Vitis vinifera | VVA015A06_53787 An expressed sequence tag da
61 | G354 | BQ121105 | 6.00E−24 | Solanum tuberosum | EST606681 mixed potato tissues Solanum tu
61 | G354 | BM527789 | 2.00E−23 | Glycine max | sal65h07.y1 Gm-c1061 Glycine max cDNA clone SOY
61 | G354 | AI898309 | 2.00E−23 | Lycopersicon esculentum | EST267752 tomato ovary, TAMU Lycope
61 | G354 | BU867080 | 5.00E−22 | Populus tremula x Populus tremuloides | S074B01 Populus imbib
61 | G354 | BQ980246 | 1.00E−21 | Lactuca sativa | QGE10I12.yg.ab1 QG_EFGHJ lettuce serriola La
61 | G354 | gi2346976 | 5.60E−29 | Petunia x hybrida | ZPT2-13.
61 | G354 | gi15623820 | 1.90E−22 | Oryza sativa | hypothetical protein.
61 | G354 | gi21104613 | 4.00E−19 | Oryza sativa (japonica cultivar-group) | contains ESTs AU07
61 | G354 | gi2981169 | 1.80E−17 | Nicotiana tabacum | osmotic stress-induced zinc-finger prot
61 | G354 | gi1763063 | 4.10E−16 | Glycine max | SCOF-1.
61 | G354 | gi4666360 | 8.90E−15 | Datisca glomerata | zinc-finger protein 1.
61 | G354 | gi2058504 | 1.00E−14 | Brassica rapa | zinc-finger protein-1.
61 | G354 | gi7228329 | 4.90E−14 | Medicago sativa | putative TFIIIA (or kruppel)-like zinc fi
61 | G354 | gi485814 | 3.20E−13 | Triticum aestivum | WZF1.
61 | G354 | gi2129892 | 1.20E−06 | Pisum sativum | probable finger protein Pszf1 - garden pea.
87 | G481 | BU238020 | 9.00E−71 | Descurainia sophia | Ds01_14a12_A Ds01_AAFC_ECORC_cold_stress
87 | G481 | BG440251 | 2.00E−56 | Gossypium arboreum | GA_Ea0006K20f Gossypium arboreum 7-10 d
87 | G481 | BF071234 | 1.00E−54 | Glycine max | st06h05.y1 Gm-c1065 Glycine max cDNA clone GENO
87 | G481 | BQ799965 | 2.00E−54 | Vitis vinifera | EST 2134 Green Grape berries Lambda Zap II L
87 | G481 | BQ488908 | 5.00E−53 | Beta vulgaris | 95-E9134-006-006-M23-T3 Sugar beet MPIZ-ADIS-
87 | G481 | BU499457 | 1.00E−52 | Zea mays | 946175D02.y1 946 - tassel primordium prepared by S
87 | G481 | AI728916 | 2.00E−52 | Gossypium hirsutum | BNLGHi12022 Six-day Cotton fiber Gossypi
87 | G481 | BG642751 | 3.00E−52 | Lycopersicon esculentum | EST510945 tomato shoot/meristem Lyc
87 | G481 | BQ857127 | 3.00E−51 | Lactuca sativa | QGB6K24.yg.ab1 QG_ABCDI lettuce salinas Lact
87 | G481 | BE413647 | 6.00E−51 | Triticum aestivum | SCU001.E10.R990714 ITEC SCU Wheat Endospe
87 | G481 | gi115840 | 1.90E−51 | Zea mays | CCAAT-BINDING TRANSCRIPTION FACTOR SUBUNIT A (CB
87 | G481 | gi20160792 | 2.60E−47 | Oryza sativa (japonica cultivar-group) | putative CAAT-box
87 | G481 | gi15408794 | 7.10E−38 | Oryza sativa | putative CCAAT-binding transcription factor
89 | G482 | BQ505706 | 7.00E−59 | Solanum tuberosum | EST613121 Generation of a set of potato c
89 | G482 | AC122165 | 6.00E−57 | Medicago truncatula | clone mth2-32m22, WORKING DRAFT SEQUENC
89 | G482 | BQ104671 | 2.00E−55 | Rosa hybrid cultivar | fc0546.e Rose Petals (Fragrant Cloud)
89 | G482 | BI469382 | 4.00E−55 | Glycine max | sai11b10.y1 Gm-c1053 Glycine max cDNA clone GEN
89 | G482 | AAAA01003638 | 1.00E−54 | Oryza sativa (indica cultivar-group) | ( ) scaffold003638
89 | G482 | AP005193 | 1.00E−54 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 7 clo
89 | G482 | BU880488 | 1.00E−53 | Populus balsamifera subsp. trichocarpa | UM49TG09 Populus flo
89 | G482 | BJ248969 | 2.00E−53 | Triticum aestivum | BJ248969 Y. Ogihara unpublished cDNA libr
89 | G482 | AC120529 | 4.00E−53 | Oryza sativa | chromosome 3 clone OSJNBa0039N21, *** SEQUENCI
89 | G482 | BU896236 | 7.00E−53 | Populus tremula x Populus tremuloides | X037F04 Populus wood
89 | G482 | gi115840 | 1.40E−46 | Zea mays | CCAAT-BINDING TRANSCRIPTION FACTOR SUBUNIT A (CB
89 | G482 | gi20160792 | 2.30E−41 | Oryza sativa (japonica cultivar-group) | putative CAAT-box
93 | G489 | BH679015 | 1.00E−111 | Brassica oleracea | BOHXO96TF BO_2_3_KB Brassica oleracea gen
93 | G489 | AC136503 | 1.00E−75 | Medicago truncatula | clone mth2-15n1, WORKING DRAFT SEQUENCE
93 | G489 | BQ118033 | 8.00E−73 | Solanum tuberosum | EST603609 mixed potato tissues Solanum tu
93 | G489 | BU873518 | 4.00E−68 | Populus balsamifera subsp. trichocarpa | Q056D09 Populus flow
93 | G489 | BI934205 | 2.00E−67 | Lycopersicon esculentum | EST554094 tomato flower, anthesis L
93 | G489 | BQ797616 | 1.00E−66 | Vitis vinifera | EST 6554 Ripening Grape berries Lambda Zap I
93 | G489 | BM064398 | 4.00E−63 | Capsicum annuum | KS01066E11 KS01 Capsicum annuum cDNA, mRNA
93 | G489 | BU927107 | 4.00E−60 | Glycine max | sas95f12.y1 Gm-c1036 Glycine max cDNA clone SOY
93 | G489 | BQ993879 | 6.00E−59 | Lactuca sativa | QGF5L12.yg.ab1 QG_EFGHJ lettuce serriola Lac
93 | G489 | AP004113 | 1.00E−58 | Oryza sativa | chromosome 2 clone OJ1116_A06, *** SEQUENCING
93 | G489 | gi5257260 | 6.20E−46 | Oryza sativa | Similar to sequence of BAC F7G19 from Arabid
93 | G489 | gi20804442 | 6.60E−19 | Oryza sativa (japonica cultivar-group) | hypothetical prote
93 | G489 | gi18481626 | 3.90E−09 | Zea mays | repressor protein.
93 | G489 | gi1808688 | 0.051 | Sporobolus stapfianus | hypothetical protein.
93 | G489 | gi8096192 | 0.21 | Lilium longiflorum | gH2A.1.
93 | G489 | gi2130105 | 0.25 | Triticum aestivum | histone H2A.4 - wheat.
93 | G489 | gi297871 | 0.27 | Picea abies | histone H2A.
93 | G489 | gi297887 | 0.31 | Daucus carota | glycine rich protein.
93 | G489 | gi15214035 | 0.75 | Cicer arietinum | HISTONE H2A.
93 | G489 | gi2317760 | 0.75 | Pinus taeda | H2A homolog.
127 | G634 | OSGT2 | 2.00E−47 | Oryza sativa | O. sativa gt-2 gene.
127 | G634 | BU049946 | 1.00E−46 | Zea mays | 1111017E09.y1 1111 - Unigene III from Maize Genome
127 | G634 | AF372499 | 6.00E−38 | Glycine max | GT-2 factor mRNA, partial cds.
127 | G634 | AB052729 | 4.00E−37 | Pisum sativum | mRNA for DNA-binding protein DF1, complete cd
127 | G634 | BU889446 | 4.00E−36 | Populus tremula | P021A05 Populus petioles cDNA library Popul
127 | G634 | BH436958 | 2.00E−35 | Brassica oleracea | BOHBE67TF BOHB Brassica oleracea genomic
127 | G634 | AI777252 | 3.00E−35 | Lycopersicon esculentum | EST258217 tomato resistant, Cornell
127 | G634 | AW686754 | 1.00E−33 | Medicago truncatula | NF042C08NR1F1000 Nodulated root Medicag
127 | G634 | AV410715 | 4.00E−33 | Lotus japonicus | AV410715 Lotus japonicus young plants (two-
127 | G634 | AI730933 | 8.00E−30 | Gossypium hirsutum | BNLGHi8208 Six-day Cotton fiber Gossypiu
127 | G634 | gi13786451 | 3.20E−78 | Oryza sativa | putative transcription factor.
127 | G634 | gi13646986 | 3.50E−66 | Pisum sativum | DNA-binding protein DF1.
127 | G634 | gi18182311 | 2.70E−38 | Glycine max | GT-2 factor.
127 | G634 | gi20161567 | 8.90E−11 | Oryza sativa (japonica cultivar-group) | hypothetical prote
127 | G634 | gi170271 | 4.70E−08 | Nicotiana tabacum | DNA-binding protein.
127 | G634 | gi18349 | 0.0027 | Daucus carota | glycine rich protein (AA 1-96).
127 | G634 | gi21388658 | 0.027 | Physcomitrella patens | glycine-rich RNA binding protein.
127 | G634 | gi21322752 | 0.052 | Triticum aestivum | cold shock protein-1.
127 | G634 | gi3126963 | 0.057 | Elaeagnus umbellata | acidic chitinase.
127 | G634 | gi1166450 | 0.087 | Lycopersicon esculentum | Tfm5.
147 | G682 | BU831849 | 8.00E−25 | Populus tremula x Populus tremuloides | T026E01 Populus apica
147 | G682 | BU872107 | 8.00E−25 | Populus balsamifera subsp. trichocarpa | Q039C07 Populus flow
147 | G682 | BM437313 | 1.00E−20 | Vitis vinifera | VVA017F06_54121 An expressed sequence tag da
147 | G682 | BI699876 | 4.00E−19 | Glycine max | sag49b09.y1 Gm-c1081 Glycine max cDNA clone GEN
147 | G682 | BH961028 | 1.00E−16 | Brassica oleracea | odj30d06.g1 B. oleracea002 Brassica olerac
147 | G682 | AL750151 | 2.00E−14 | Pinus pinaster | AL750151 AS Pinus pinaster cDNA clone AS06C1
147 | G682 | BJ476463 | 1.00E−13 | Hordeum vulgare subsp. vulgare | BJ476463 K. Sato unpublished
147 | G682 | AJ485557 | 1.00E−13 | Hordeum vulgare | AJ485557 S00011 Hordeum vulgare cDNA clone
147 | G682 | CA762299 | 2.00E−13 | Oryza sativa (indica cultivar-group) | BR060003B10F03.ab1 IRR
147 | G682 | CA736777 | 2.00E−12 | Triticum aestivum | wpi1s.pk008.n12 wpi1s Triticum aestivum c
147 | G682 | gi23476287 | 8.30E−12 | Gossypium hirsutum | myb-like transcription factor 2.
147 | G682 | gi23476291 | 8.30E−12 | Gossypium raimondii | myb-like transcription factor 2.
147 | G682 | gi23476293 | 8.30E−12 | Gossypium herbaceum | myb-like transcription factor 2.
147 | G682 | gi23476295 | 8.30E−12 | Gossypioides kirkii | myb-like transcription factor 2.
147 | G682 | gi15042120 | 2.20E−11 | Zea luxurians | CI protein.
147 | G682 | gi19548449 | 2.20E−11 | Zea mays | P-type R2R3 Myb protein.
147 | G682 | gi9954118 | 2.80E−11 | Solanum tuberosum | tuber-specific and sucrose-responsive e
147 | G682 | gi15042108 | 4.60E−11 | Zea mays subsp. parviglumis | CI protein.
147 | G682 | gi15082210 | 1.50E−10 | Fragaria x ananassa | transcription factor MYB1.
147 | G682 | gi22266669 | 1.50E−10 | Vitis labrusca x Vitis vinifera | myb-related transcription
167 | G864 | BH472654 | 1.00E−105 | Brassica oleracea | BOHPF07TF BOHP Brassica oleracea genomic
167 | G864 | AP004902 | 2.00E−44 | Lotus japonicus | genomic DNA, chromosome 2, clone: LjT04G24,
167 | G864 | BM886518 | 5.00E−40 | Glycine max | sam17f08.y1 Gm-c1068 Glycine max cDNA clone SOY
167 | G864 | AW685524 | 5.00E−39 | Medicago truncatula | NF031C12NR1F1000 Nodulated root Medicag
167 | G864 | AP001800 | 6.00E−36 | Oryza sativa | genomic DNA, chromosome 1, PAC clone: P0443E05.
167 | G864 | LEU89257 | 6.00E−32 | Lycopersicon esculentum | DNA-binding protein Pti6 mRNA, comp
167 | G864 | AAAA01000263 | 7.00E−31 | Oryza sativa (indica cultivar-group) | ( ) scaffold000263
167 | G864 | BQ873772 | 8.00E−30 | Lactuca sativa | QGI2I03.yg.ab1 QG_ABCDI lettuce salinas Lact
167 | G864 | AF058827 | 7.00E−29 | Nicotiana tabacum | TSI1 (Tsi1) mRNA, complete cds.
167 | G864 | BZ419846 | 3.00E−25 | Zea mays | if61a07.b1 WGS-ZmaysF (DH5a methyl filtered) Zea m
167 | G864 | gi8096469 | 1.60E−38 | Oryza sativa | Similar to Arabidopsis thaliana chromosome 4
167 | G864 | gi2213785 | 1.00E−34 | Lycopersicon esculentum | Pti6.
167 | G864 | gi23617235 | 3.70E−25 | Oryza sativa (japonica cultivar-group) | contains ESTs AU16
167 | G864 | gi3065895 | 7.60E−25 | Nicotiana tabacum | TSI1.
167 | G864 | gi3264767 | 1.90E−21 | Prunus armeniaca | AP2 domain containing protein.
167 | G864 | gi8571476 | 4.30E−21 | Atriplex hortensis | apetala2 domain-containing protein.
167 | G864 | gi17385636 | 2.80E−20 | Matricaria chamomilla | ethylene-responsive element binding
167 | G864 | gi8809571 | 4.50E−20 | Nicotiana sylvestris | ethylene-responsive element binding
167 | G864 | gi7528276 | 5.70E−20 | Mesembryanthemum crystallinum | AP2-related transcription f
167 | G864 | gi21908036 | 9.30E−20 | Zea mays | DRE binding factor 1.
169 | G867 | BQ971511 | 2.00E−94 | Helianthus annuus | QHB7E05.yg.ab1 QH_ABCDI sunflower RHA801
169 | G867 | AP003450 | 6.00E−85 | Oryza sativa | chromosome 1 clone P0034C09, *** SEQUENCING IN
169 | G867 | AC135925 | 1.00E−80 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 5 clo
169 | G867 | AAAA01000997 | 1.00E−79 | Oryza sativa (indica cultivar-group) | ( ) scaffold000997
169 | G867 | BQ405698 | 2.00E−77 | Gossypium arboreum | GA_Ed0085H02f Gossypium arboreum 7-10 d
169 | G867 | BZ015521 | 4.00E−69 | Brassica oleracea | oeg86a05.g1 B. oleracea002 Brassica olerac
169 | G867 | BF520598 | 2.00E−66 | Medicago truncatula | EST458071 DSIL Medicago truncatula cDNA
169 | G867 | BU994579 | 4.00E−64 | Hordeum vulgare subsp. vulgare | HM07I08r HM Hordeum vulgare
169 | G867 | BF424857 | 2.00E−62 | Glycine max | su59h03.y1 Gm-c1069 Glycine max cDNA clone GENO
169 | G867 | BU871082 | 1.00E−61 | Populus balsamifera subsp. trichocarpa | Q026F06 Populus flow
169 | G867 | gi18565433 | 2.40E−85 | Oryza sativa (japonica cultivar-group) | DNA-binding protei
169 | G867 | gi12328560 | 2.90E−73 | Oryza sativa | putative DNA binding protein RAV2.
169 | G867 | gi10798644 | 7.30E−13 | Nicotiana tabacum | AP2 domain-containing transcription fac
169 | G867 | gi18266198 | 2.50E−10 | Narcissus pseudonarcissus | AP-2 domain containing protein.
169 | G867 | gi20340233 | 2.50E−10 | Thellungiella halophila | ethylene responsive element bindi
169 | G867 | gi22074046 | 1.50E−09 | Lycopersicon esculentum | transcription factor JERF1.
169 | G867 | gi3264767 | 6.90E−09 | Prunus armeniaca | AP2 domain containing protein.
169 | G867 | gi18496063 | 7.10E−09 | Fagus sylvatica | ethylene responsive element binding prote
169 | G867 | gi13173164 | 8.30E−09 | Pisum sativum | APETAL2-like protein.
169 | G867 | gi1730475 | 8.70E−09 | Hordeum vulgare | viviparous-1.
185 | G912 | BH498662 | 2.00E−93 | Brassica oleracea | BOGTO66TR BOGT Brassica oleracea genomic
185 | G912 | AF084185 | 2.00E−75 | Brassica napus | dehydration responsive element binding prote
185 | G912 | AF211531 | 1.00E−59 | Nicotiana tabacum | Avr9/Cf-9 rapidly elicited protein 111B
185 | G912 | AY034473 | 1.00E−55 | Lycopersicon esculentum | putative transcriptional activator
185 | G912 | BG321601 | 4.00E−53 | Descurainia sophia | Ds01_01h03_R Ds01_AAFC_ECORC_cold_stress
185 | G912 | AB080965 | 9.00E−53 | Prunus avium | DREB1-like gene for dehydratiion responsive el
185 | G912 | BG590659 | 4.00E−51 | Solanum tuberosum | EST498501 P. infestans-challenged leaf So
185 | G912 | BG644969 | 1.00E−50 | Medicago truncatula | EST506588 KV3 Medicago truncatula cDNA
185 | G912 | BU016783 | 2.00E−49 | Helianthus annuus | QHE14A02.yg.ab1 QH_EFGHJ sunflower RHA280
185 | G912 | BU871514 | 1.00E−47 | Populus balsamifera subsp. trichocarpa | Q031D09 Populus flow
185 | G912 | gi5616086 | 5.90E−73 | Brassica napus | dehydration responsive element binding pro
185 | G912 | gi12003384 | 5.20E−58 | Nicotiana tabacum | Avr9/Cf-9 rapidly elicited protein 111B
185 | G912 | gi23495458 | 3.90E−53 | Prunus avium | dehydratiion responsive element binding prot
185 | G912 | gi18535580 | 2.00E−49 | Lycopersicon esculentum | putative transcriptional activato
185 | G912 | gi19071243 | 1.30E−45 | Hordeum vulgare | CRT/DRE binding factor 1.
185 | G912 | gi24474328 | 8.20E−44 | Oryza sativa (japonica cultivar-group) | apetala2 domain-co
185 | G912 | gi6983877 | 9.00E−38 | Oryza sativa | Similar to mRNA for DREB1A (AB007787).
185 | G912 | gi17148651 | 3.90E−35 | Secale cereale | CBF-like protein.
185 | G912 | gi20152903 | 1.40E−32 | Hordeum vulgare subsp. vulgare | CRT/DRE binding factor 2.
185 | G912 | gi17226801 | 2.10E−31 | Triticum aestivum | putative CRT/DRE-binding factor.
187 | G913 | AI352878 | 4.00E−87 | Brassica napus | MB72-11D PZ204.BNlib Brassica napus cDNA clo
187 | G913 | BH536782 | 1.00E−59 | Brassica oleracea | BOGCX29TR BOGC Brassica oleracea genomic
187 | G913 | AW033835 | 2.00E−46 | Lycopersicon esculentum | EST277406 tomato callus, TAMU Lycop
187 | G913 | BQ411166 | 1.00E−43 | Gossypium arboreum | GA_Ed0037B05f Gossypium arboreum 7-10 d
187 | G913 | BQ165313 | 5.00E−43 | Medicago truncatula | EST611182 KVKC Medicago truncatula cDNA
187 | G913 | AP006060 | 5.00E−43 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 2 clo
187 | G913 | AAAA01000810 | 2.00E−42 | Oryza sativa (indica cultivar-group) | ( ) scaffold000810
187 | G913 | OSJN00128 | 2.00E−38 | Oryza sativa | chromosome 4 clone OSJNBA0088I22, *** SEQUENC
187 | G913 | BQ976989 | 3.00E−31 | Helianthus annuus | QHI23I22.yg.ab1 QH_ABCDI sunflower RHA801
187 | G913 | BQ592028 | 6.00E−30 | Beta vulgaris | E012695-024-021-K17-SP6 MPIZ-ADIS-024-develop
187 | G913 | gi14140155 | 1.60E−32 | Oryza sativa | putative AP2 domain transcription factor.
187 | G913 | gi12003382 | 1.40E−30 | Nicotiana tabacum | Avr9/Cf-9 rapidly elicited protein 111A
187 | G913 | gi20303570 | 1.40E−30 | Oryza sativa (japonica cultivar-group) | putative transcrip
187 | G913 | gi18535580 | 3.80E−30 | Lycopersicon esculentum | putative transcriptional activato
187 | G913 | gi23495460 | 4.40E−29 | Prunus avium | dehydration responsive element binding prote
187 | G913 | gi5616086 | 6.50E−28 | Brassica napus | dehydration responsive element binding pro
187 | G913 | gi21908034 | 1.40E−25 | Zea mays | DRE binding factor 2.
187 | G913 | gi19071243 | 1.20E−21 | Hordeum vulgare | CRT/DRE binding factor 1.
187 | G913 | gi17148649 | 2.30E−17 | Secale cereale | CBF-like protein.
187 | G913 | gi8571476 | 2.30E−17 | Atriplex hortensis | apetala2 domain-containing protein.
189 | G922 | AP004485 | 1.0e−999 | Lotus japonicus | genomic DNA, chromosome 2, clone: LjT08D14,
189 | G922 | AP003259 | 1.00E−130 | Oryza sativa | chromosome 1 clone P0466H10, *** SEQUENCING IN
189 | G922 | AAAA01000374 | 1.00E−130 | Oryza sativa (indica cultivar-group) | ( ) scaffold000374
189 | G922 | BH493536 | 1.00E−121 | Brassica oleracea | BOGXB10TR BOGX Brassica oleracea genomic
189 | G922 | CNS08CCP | 1.00E−92 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 12 cl
189 | G922 | BG643567 | 6.00E−82 | Lycopersicon esculentum | EST511761 tomato shoot/meristem Lyc
189 | G922 | BQ124898 | 2.00E−81 | Medicago truncatula | EST610474 GLSD Medicago truncatula cDNA
189 | G922 | BU764181 | 2.00E−71 | Glycine max | sas53f07.y1 Gm-c1023 Glycine max cDNA clone SOY
189 | G922 | BG595716 | 3.00E−62 | Solanum tuberosum | EST494394 cSTS Solanum tuberosum cDNA clo
189 | G922 | AF378125 | 6.00E−55 | Vitis vinifera | GAI-like protein 1 (GAI1) gene, complete cds
189 | G922 | gi22830925 | 6.30E−127 | Oryza sativa (japonica cultivar-group) | putative gibberell
189 | G922 | gi13365610 | 3.00E−57 | Pisum sativum | SCARECROW.
189 | G922 | gi13170126 | 5.20E−55 | Brassica napus | unnamed protein product.
189 | G922 | gi10178637 | 6.30E−51 | Zea mays | SCARECROW.
189 | G922 | gi13937306 | 2.30E−50 | Oryza sativa | gibberellin-insensitive protein OsGAI.
189 | G922 | gi18254373 | 9.20E−50 | Hordeum vulgare | nuclear transcription factor SLN1.
189 | G922 | gi5640157 | 2.60E−49 | Triticum aestivum | gibberellin response modulator.
189 | G922 | gi20257451 | 3.10E−49 | Calycadenia multiglandulosa | GIA/RGA-like gibberellin resp
189 | G922 | gi13620224 | 1.30E−46 | Lycopersicon esculentum | lateral suppressor.
189 | G922 | gi13620166 | 2.20E−41 | Capsella rubella | hypothetical protein.
191 | G926 | BU573158 | 1.00E−56 | Prunus dulcis | PA_Ea0003A12f Almond developing seed Prunus
191 | G926 | BI310587 | 2.00E−55 | Medicago truncatula | EST5312337 GESD Medicago truncatula cDN
191 | G926 | BQ624240 | 1.00E−47 | Citrus sinensis | USDA-FP_01331 Ridge pineapple sweet orange
191 | G926 | BH443554 | 3.00E−44 | Brassica oleracea | BOHGN12TR BOHG Brassica oleracea genomic
191 | G926 | BNU33884 | 2.00E−39 | Brassica napus | clone bncbf-b1 CCAAT-binding factor B subuni
191 | G926 | BF113081 | 8.00E−38 | Lycopersicon esculentum | EST440591 tomato breaker fruit Lyco
191 | G926 | BG886494 | 2.00E−36 | Solanum tuberosum | EST512345 cSTD Solanum tuberosum cDNA clo
191 | G926 | AW472517 | 3.00E−36 | Glycine max | si26c12.y1 Gm-r1030 Glycine max cDNA clone GENO
191 | G926 | BQ407583 | 6.00E−36 | Gossypium arboreum | GA_Ed0108F07f Gossypium arboreum 7-10 d
191 | G926 | BG343051 | 7.00E−34 | Hordeum vulgare | HVSMEg0001N16f Hordeum vulgare pre-anthesis
191 | G926 | gi1173616 | 9.70E−41 | Brassica napus | CCAAT-binding factor B subunit homolog.
191 | G926 | gi2826786 | 1.10E−27 | Oryza sativa | RAPB protein.
191 | G926 | gi7141243 | 5.80E−27 | Vitis riparia | transcription factor.
191 | G926 | gi4731314 | 4.00E−19 | Nicotiana tabacum | CCAAT-binding transcription factor subu
191 | G926 | gi2104675 | 0.0061 | Vicia faba | transcription factor.
191 | G926 | gi21667471 | 0.64 | Hordeum vulgare | CONSTANS-like protein.
191 | G926 | gi13775107 | 0.67 | Phaseolus vulgaris | bZIP transcription factor 2.
191 | G926 | gi1096930 | 0.69 | Solanum tuberosum | H ATPase inhibitor.
191 | G926 | gi24413952 | 0.72 | Oryza sativa (japonica cultivar-group) | putative iron supe
191 | G926 | gi1839593 | 0.78 | Zea mays | heat shock protein 70 homolog {clone CHEM 3} [Ze
199 | G975 | BH477624 | 1.00E−69 | Brassica oleracea | BOGNB10TF BOGN Brassica oleracea genomic
199 | G975 | CA486875 | 3.00E−64 | Triticum aestivum | WHE4337_A02_A03ZS Wheat meiotic anther cD
199 | G975 | BI978981 | 2.00E−60 | Rosa chinensis | zD09 Old Blush petal SMART library Rosa chin
199 | G975 | AP004869 | 9.00E−60 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 2 clo
199 | G975 | BU978490 | 1.00E−58 | Hordeum vulgare subsp. vulgare | HA13G05r HA Hordeum vulgare
199 | G975 | BG642554 | 8.00E−57 | Lycopersicon esculentum | EST356031 tomato flower buds, anthe
199 | G975 | BI958226 | 2.00E−54 | Hordeum vulgare | HVSMEn0013P17f Hordeum vulgare rachis EST 1
199 | G975 | BQ104740 | 1.00E−52 | Rosa hybrid cultivar | fc0212.e Rose Petals (Fragrant Cloud)
199 | G975 | AW705973 | 3.00E−51 | Glycine max | sk64c02.y1 Gm-c1016 Glycine max cDNA clone GENO
199 | G975 | AP003615 | 1.00E−47 | Oryza sativa | chromosome 6 clone P0486H12, *** SEQUENCING IN
199 | G975 | gi18650662 | 1.80E−25 | Lycopersicon esculentum | ethylene response factor 1.
199 | G975 | gi131754 | 2.10E−22 | Lupinus polyphyllus | PPLZ02 PROTEIN.
199 | G975 | gi3065895 | 9.20E−20 | Nicotiana tabacum | TSI1.
199 | G975 | gi8571476 | 9.30E−20 | Atriplex hortensis | apetala2 domain-containing protein.
199 | G975 | gi19920190 | 1.90E−19 | Oryza sativa (japonica cultivar-group) | Putative AP2 domai
199 | G975 | gi21908036 | 8.40E−19 | Zea mays | DRE binding factor 1.
199 | G975 | gi4099914 | 1.10E−18 | Stylosanthes hamata | ethylene-responsive element binding p
199 | G975 | gi10567106 | 1.60E−18 | Oryza sativa | osERF3.
199 | G975 | gi8809573 | 9.60E−18 | Nicotiana sylvestris | ethylene-responsive element binding
199 | G975 | gi7528276 | 1.20E−17 | Mesembryanthemum crystallinum | AP2-related transcription f
221 | G1069 | BZ025139 | 1.00E−111 | Brassica oleracea | oeh63d12.g1 B. oleracea002 Brassica olerac
221 | G1069 | AP004971 | 1.00E−93 | Lotus japonicus | genomic DNA, chromosome 5, clone: LjT45G21,
221 | G1069 | AP004020 | 2.00E−79 | Oryza sativa | chromosome 2 clone OJ1119_A01, *** SEQUENCING
221 | G1069 | AAAA01017331 | 2.00E−70 | Oryza sativa (indica cultivar-group) | ( ) scaffold017331
221 | G1069 | BQ165495 | 2.00E−62 | Medicago truncatula | EST611364 KVKC Medicago truncatula cDNA
221 | G1069 | AC135209 | 2.00E−61 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 3 clo
221 | G1069 | AW621455 | 4.00E−59 | Lycopersicon esculentum | EST312253 tomato root during/after
221 | G1069 | BM110212 | 4.00E−58 | Solanum tuberosum | EST557748 potato roots Solanum tuberosum
221 | G1069 | BQ785950 | 7.00E−58 | Glycine max | saq61f09.y1 Gm-c1076 Glycine max cDNA clone SOY
221 | G1069 | BQ863249 | 1.00E−57 | Lactuca sativa | QGC23G02.yg.ab1 QG_ABCDI lettuce salinas Lac
221 | G1069 | gi24059979 | 2.10E−38 | Oryza sativa (japonica cultivar-group) | similar to DNA-bin
221 | G1069 | gi15528814 | 4.50E−36 | Oryza sativa | hypothetical protein~similar to Arabidopsis
221 | G1069 | gi4165183 | 7.60E−25 | Antirrhinum majus | SAP1 protein.
221 | G1069 | gi2213534 | 1.20E−19 | Pisum sativum | DNA-binding PD1-like protein.
221 | G1069 | gi2459999 | 1 | Chlamydomonas reinhardtii | tubulin Uni3.
221 | G1069 | gi100872 | 1 | Zea mays | MFS18 protein - maize.
221 | G1069 | gi1362165 | 1 | Hordeum vulgare | hypothetical protein 2 (clone ES1A) - bar
223 | G1073 | AAAA01000486 | 4.00E−74 | Oryza sativa (indica cultivar-group) | ( ) scaffold000486
223 | G1073 | AP004165 | 4.00E−74 | Oryza sativa | chromosome 2 clone OJ1479_B12, *** SEQUENCING
223 | G1073 | AP005477 | 2.00E−67 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 6 clo
223 | G1073 | BZ412041 | 3.00E−65 | Zea mays | OGACG56TC ZM_0.7_1.5_KB Zea mays genomic clone ZMM
223 | G1073 | AJ502190 | 3.00E−64 | Medicago truncatula | AJ502190 MTAMP Medicago truncatula cDNA
223 | G1073 | BQ865858 | 4.00E−63 | Lactuca sativa | QGC6B08.yg.ab1 QG_ABCDI lettuce salinas Lact
223 | G1073 | BH975957 | 5.00E−63 | Brassica oleracea | odh67e11.g1 B. oleracea002 Brassica olerac
223 | G1073 | BG134451 | 8.00E−62 | Lycopersicon esculentum | EST467343 tomato crown gall Lycoper
223 | G1073 | AP004971 | 3.00E−60 | Lotus japonicus | genomic DNA, chromosome 5, clone: LjT45G21,
223 | G1073 | BM110212 | 7.00E−58 | Solanum tuberosum | EST557748 potato roots Solanum tuberosum
223 | G1073 | gi15528814 | 5.50E−38 | Oryza sativa | hypothetical protein~similar to Arabidopsis
223 | G1073 | gi24059979 | 1.30E−29 | Oryza sativa (japonica cultivar-group) | similar to DNA-bin
223 | G1073 | gi2213536 | 1.20E−21 | Pisum sativum | DNA-binding protein PD1.
223 | G1073 | gi4165183 | 5.70E−20 | Antirrhinum majus | SAP1 protein.
223 | G1073 | gi1166450 | 0.00059 | Lycopersicon esculentum | Tfm5.
223 | G1073 | gi11545668 | 0.0051 | Chlamydomonas reinhardtii | CIA5.
223 | G1073 | gi4755087 | 0.0054 | Zea mays | aluminum-induced protein; Al-induced protein.
223 | G1073 | gi395147 | 0.0068 | Nicotiana tabacum | glycine-rich protein.
223 | G1073 | gi21068672 | 0.017 | Cicer arietinum | putative glicine-rich protein.
223 | G1073 | gi1346181 | 0.017 | Sinapis alba | GLYCINE-RICH RNA-BINDING PROTEIN GRP2A.
225 | G1075 | BH596283 | 1.00E−108 | Brassica oleracea | BOGBL42TR BOGB Brassica oleracea genomic
225 | G1075 | BQ165495 | 5.00E−88 | Medicago truncatula | EST611364 KVKC Medicago truncatula cDNA
225 | G1075 | AAAA01003389 | 3.00E−84 | Oryza sativa (indica cultivar-group) | ( ) scaffold003389
225 | G1075 | OSJN00182 | 3.00E−84 | Oryza sativa | chromosome 4 clone OSJNBa0086O06, *** SEQUENC
225 | G1075 | BZ412041 | 1.00E−76 | Zea mays | OGACG56TC ZM_0.7_1.5_KB Zea mays genomic clone ZMM
225 | G1075 | AP005653 | 1.00E−68 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 2 clo
225 | G1075 | BQ863249 | 3.00E−65 | Lactuca sativa | QGC23G02.yg.ab1 QG_ABCDI lettuce salinas Lac
225 | G1075 | BM110212 | 2.00E−63 | Solanum tuberosum | EST557748 potato roots Solanum tuberosum
225 | G1075 | BQ838600 | 8.00E−63 | Triticum aestivum | WHE2912_D12_H24ZS Wheat aluminum-stressed
225 | G1075 | AP004971 | 4.00E−62 | Lotus japonicus | genomic DNA, chromosome 5, clone: LjT45G21,
225 | G1075 | gi15528814 | 3.80E−39 | Oryza sativa | hypothetical protein~similar to Arabidopsis
225 | G1075 | gi24059979 | 6.60E−35 | Oryza sativa (japonica cultivar-group) | similar to DNA-bin
225 | G1075 | gi4165183 | 7.30E−20 | Antirrhinum majus | SAP1 protein.
225 | G1075 | gi2213534 | 2.50E−19 | Pisum sativum | DNA-binding PD1-like protein.
225 | G1075 | gi3810890 | 3.70E−05 | Cucumis sativus | glycine-rich protein-2.
225 | G1075 | gi7489009 | 0.0001 | Lycopersicon esculentum | glycine-rich protein (clone w10-1
225 | G1075 | gi4115615 | 0.0018 | Zea mays | root cap-specific glycine-rich protein.
225 | G1075 | gi1628463 | 0.004 | Silene latifolia | Men-4.
225 | G1075 | gi395147 | 0.005 | Nicotiana tabacum | glycine-rich protein.
225 | G1075 | gi121631 | 0.0056 | Nicotiana sylvestris | GLYCINE-RICH CELL WALL STRUCTURAL PR
269 | G1411 | BZ017225 | 3.00E−51 | Brassica oleracea | oei67e03.b1 B. oleracea002 Brassica olerac
269 | G1411 | BQ138607 | 8.00E−44 | Medicago truncatula | NF005C01PH1F1004 Phoma-infected Medicag
269 | G1411 | BQ786702 | 5.00E−36 | Glycine max | saq72b07.y1 Gm-c1076 Glycine max cDNA clone SOY
269 | G1411 | BM062508 | 7.00E−32 | Capsicum annuum | KS01043F09 KS01 Capsicum annuum cDNA, mRNA
269 | G1411 | AAAA01000832 | 2.00E−30 | Oryza sativa (indica cultivar-group) | ( ) scaffold000832
269 | G1411 | OSJN00240 | 2.00E−30 | Oryza sativa | genomic DNA, chromosome 4, BAC clone: OSJNBa0
269 | G1411 | BE419451 | 2.00E−29 | Triticum aestivum | WWS012.C2R000101 ITEC WWS Wheat Scutellum
269 | G1411 | CA014817 | 6.00E−29 | Hordeum vulgare subsp. vulgare | HT12H01r HT Hordeum vulgare
269 | G1411 | BE642320 | 1.00E−28 | Ceratopteris richardii | Cri2_5_L17_SP6 Ceratopteris Spore Li
269 | G1411 | BE494041 | 2.00E−27 | Secale cereale | WHE1277_B09_D17ZS Secale cereale anther cDNA
269 | G1411 | gi20160854 | 1.40E−29 | Oryza sativa (japonica cultivar-group) | hypothetical prote
269 | G1411 | gi14140141 | 1.50E−24 | Oryza sativa | putative AP2-related transcription factor.
269 | G1411 | gi3342211 | 1.40E−23 | Lycopersicon esculentum | Pti4.
269 | G1411 | gi10798644 | 2.30E−23 | Nicotiana tabacum | AP2 domain-containing transcription fac
269 | G1411 | gi8809571 | 2.30E−23 | Nicotiana sylvestris | ethylene-responsive element binding
269 | G1411 | gi24817250 | 3.00E−23 | Cicer arietinum | transcription factor EREBP-like protein.
269 | G1411 | gi3264767 | 3.00E−23 | Prunus armeniaca | AP2 domain containing protein.
269 | G1411 | gi1688233 | 3.80E−23 | Solanum tuberosum | DNA binding protein homolog.
269 | G1411 | gi7528276 | 3.80E−23 | Mesembryanthemum crystallinum | AP2-related transcription f
269 | G1411 | gi21304712 | 6.20E−23 | Glycine max | ethylene-responsive element binding protein 1
277 | G1451 | AB071298 | 1.0e−999 | Oryza sativa | OsARF8 mRNA for auxin response factor 8, parti
277 | G1451 | AY105215 | 1.00E−157 | Zea mays | PCO121637 mRNA sequence.
277 | G1451 | AW690130 | 1.00E−109 | Medicago truncatula | NF028B12ST1F1000 Developing stem Medica
277 | G1451 | BQ862285 | 1.00E−108 | Lactuca sativa | QGC20K23.yg.ab1 QG_ABCDI lettuce salinas Lac
277 | G1451 | BG597435 | 1.00E−107 | Solanum tuberosum | EST496113 cSTS Solanum tuberosum cDNA clo
277 | G1451 | BJ303602 | 1.00E−104 | Triticum aestivum | BJ303602 Y. Ogihara unpublished cDNA libr
277 | G1451 | OSA306306 | 1.00E−103 | Oryza sativa (japonica cultivar-group) | Oryza sativa subsp.
277 | G1451 | BQ595269 | 1.00E−89 | Beta vulgaris | E012710-024-023-D13-SP6 MPIZ-ADIS-024-develop
277 | G1451 | CA801218 | 1.00E−86 | Glycine max | sau02f06.y2 Gm-c1062 Glycine max cDNA clone SOY
277 | G1451 | BG159611 | 8.00E−79 | Sorghum bicolor | OV2_6_G07.b1_A002 Ovary 2 (OV2) Sorghum bic
277 | G1451 | gi19352049 | 3.70E−247 | Oryza sativa | auxin response factor 8.
277 | G1451 | gi20805236 | 3.10E−126 | Oryza sativa (japonica cultivar-group) | auxin response fac
277 | G1451 | gi24785191 | 4.10E−55 | Nicotiana tabacum | hypothetical protein.
277 | G1451 | gi23343944 | 2.40E−28 | Mirabilis jalapa | auxin-responsive factor protein.
277 | G1451 | gi20269053 | 7.00E−10 | Populus tremula x Populus tremuloides | aux/IAA protein.
277 | G1451 | gi287566 | 3.10E−06 | Vigna radiata | ORF.
277 | G1451 | gi114733 | 1.10E−05 | Glycine max | AUXIN-INDUCED PROTEIN AUX22.
277 | G1451 | gi871511 | 2.40E−05 | Pisum sativum | auxin-induced protein.
277 | G1451 | gi18697008 | 0.00027 | Zea mays | unnamed protein product.
277 | G1451 | gi17976835 | 0.00068 | Pinus pinaster | putative auxin induced transcription facto
303 | G1543 | AF145727 | 4.00E−51 | Oryza sativa | homeodomain leucine zipper protein (hox3) mRNA
303 | G1543 | CA030381 | 6.00E−41 | Hordeum vulgare subsp. vulgare | HX06O07r HX Hordeum vulgare
303 | G1543 | BQ741095 | 6.00E−39 | Glycine max | saq14c10.y1 Gm-c1045 Glycine max cDNA clone SOY
303 | G1543 | AT002118 | 1.00E−38 | Brassica rapa subsp. pekinensis | AT002118 Flower bud cDNA Br
303 | G1543 | BQ857226 | 2.00E−37 | Lactuca sativa | QGB6P03.yg.ab1 QG_ABCDI lettuce salinas Lact
303 | G1543 | AB028075 | 4.00E−37 | Physcomitrella patens | mRNA for homeobox protein PpHB4, comp
303 | G1543 | PBPHZ4GEN | 4.00E−37 | Pimpinella brachycarpa | P. brachycarpa mRNA for homeobox-leu
303 | G1543 | LEHDZIPP | 5.00E−37 | Lycopersicon esculentum | L. esculentum mRNA for HD-ZIP protei
303 | G1543 | AF443619 | 1.00E−36 | Craterostigma plantagineum | homeodomain leucine zipper prote
303 | G1543 | AJ498394 | 2.00E−36 | Medicago truncatula | AJ498394 MTPOSE Medicago truncatula cDN
303 | G1543 | gi5006851 | 8.30E−51 | Oryza sativa | homeodomain leucine zipper protein.
303 | G1543 | gi20161555 | 1.70E−50 | Oryza sativa (japonica cultivar-group) | putative homeodoma
303 | G1543 | gi18034437 | 1.60E−38 | Craterostigma plantagineum | homeodomain leucine zipper pro
303 | G1543 | gi1149535 | 4.30E−38 | Pimpinella brachycarpa | homeobox-leucine zipper protein.
303 | G1543 | gi992598 | 1.20E−37 | Lycopersicon esculentum | HD-ZIP protein.
303 | G1543 | gi7415620 | 1.50E−37 | Physcomitrella patens | homeobox protein PpHB4.
303 | G1543 | gi1234900 | 3.10E−37 | Glycine max | homeobox-leucine zipper protein.
303 | G1543 | gi3868847 | 1.90E−35 | Ceratopteris richardii | CRHB10.
303 | G1543 | gi8919876 | 1.90E−35 | Capsella rubella | hypothetical protein.
303 | G1543 | gi1032372 | 3.20E−35 | Helianthus annuus | homeodomain protein.
331 | G1792 | AI776626 | 5.00E−35 | Lycopersicon esculentum | EST257726 tomato resistant, Cornell
331 | G1792 | BQ045702 | 1.00E−32 | Solanum tuberosum | EST594820 P. infestans-challenged potato
331 | G1792 | BM178875 | 7.00E−32 | Glycine max | saj60f01.y1 Gm-c1072 Glycine max cDNA clone SOY
331 | G1792 | BF649790 | 1.00E−31 | Medicago truncatula | NF084C07EC1F1052 Elicited cell culture
331 | G1792 | BZ020356 | 1.00E−30 | Brassica oleracea | oeg04a10.g1 B. oleracea002 Brassica olerac
331 | G1792 | BZ337899 | 3.00E−30 | Sorghum bicolor | ia91f11.b1 WGS-SbicolorF (JM107 adapted met
331 | G1792 | AC025907 | 3.00E−30 | Oryza sativa | chromosome 10 clone nbxb0094K20, *** SEQUENCIN
331 | G1792 | AAAA01002491 | 3.00E−30 | Oryza sativa (indica cultivar-group) | ( ) scaffold002491
331 | G1792 | BZ359367 | 8.00E−30 | Zea mays | id72f11.b1 WGS-ZmaysF (JM107 adapted methyl filter
331 | G1792 | AC137635 | 2.00E−27 | Oryza sativa (japonica cultivar-group) | Genomic sequence for
331 | G1792 | gi23452024 | 4.00E−26 | Lycopersicon esculentum | transcription factor TSRF1.
331 | G1792 | gi1732406 | 2.10E−25 | Nicotiana tabacum | S25-XP1 DNA binding protein.
331 | G1792 | gi12597874 | 3.70E−25 | Oryza sativa | putative ethylene-responsive element binding
331 | G1792 | gi7528276 | 7.60E−25 | Mesembryanthemum crystallinum | AP2-related transcription f
331 | G1792 | gi24060081 | 1.30E−23 | Oryza sativa (japonica cultivar-group) | putative ethylene
331 | G1792 | gi8980313 | 1.80E−23 | Catharanthus roseus | AP2-domain DNA-binding protein.
331 | G1792 | gi8809571 | 1.80E−23 | Nicotiana sylvestris | ethylene-responsive element binding
331 | G1792 | gi17385636 | 1.20E−21 | Matricaria chamomilla | ethylene-responsive element binding
331 | G1792 | gi21304712 | 3.10E−21 | Glycine max | ethylene-responsive element binding protein 1
331 | G1792 | gi8571476 | 1.10E−20 | Atriplex hortensis | apetala2 domain-containing protein.
341 | G1820 | AW776719 | 1.00E−43 | Medicago truncatula | EST335784 DSIL Medicago truncatula cDNA
341 | G1820 | BM065544 | 3.00E−40 | Capsicum annuum | KS07004F12 KS07 Capsicum annuum cDNA, mRNA
341 | G1820 | BG591677 | 4.00E−40 | Solanum tuberosum | EST499519 P. infestans-challenged leaf So
341 | G1820 | BI701620 | 1.00E−38 | Glycine max | sai18a04.y1 Gm-c1053 Glycine max cDNA clone GEN
341 | G1820 | BQ411597 | 3.00E−37 | Gossypium arboreum | GA_Ed0041B06f Gossypium arboreum 7-10 d
341 | G1820 | BE208917 | 6.00E−37 | Citrus x paradisi | GF-FV-P3F5 Marsh grapefruit young flavedo
341 | G1820 | BH725354 | 1.00E−36 | Brassica oleracea | BOHVO37TF BO_2_3_KB Brassica oleracea gen
341 | G1820 | AW093662 | 9.00E−36 | Lycopersicon esculentum | EST286842 tomato mixed elicitor, BT
341 | G1820 | BU819346 | 4.00E−35 | Populus tremula | UA42BPF01 Populus tremula cambium cDNA libr
341 | G1820 | AAAA01002977 | 3.00E−34 | Oryza sativa (indica cultivar-group) | ( ) scaffold002977
341 | G1820 | gi5257260 | 1.40E−34 | Oryza sativa | Similar to sequence of BAC F7G19 from Arabid
341 | G1820 | gi20804442 | 1.70E−15 | Oryza sativa (japonica cultivar-group) | hypothetical prote
341 | G1820 | gi18481626 | 6.30E−08 | Zea mays | repressor protein.
341 | G1820 | gi297871 | 0.39 | Picea abies | histone H2A.
341 | G1820 | gi297887 | 0.41 | Daucus carota | glycine rich protein.
341 | G1820 | gi2130105 | 0.54 | Triticum aestivum | histone H2A.4 - wheat.
341 | G1820 | gi6782438 | 0.74 | Nicotiana glauca | glycine-rich protein.
341 | G1820 | gi15214035 | 0.98 | Cicer arietinum | HISTONE H2A.
341 | G1820 | gi2317760 | 0.98 | Pinus taeda | H2A homolog.
341 | G1820 | gi1173628 | 0.99 | Phalaenopsis sp. SM9108 | glycine-rich protein.
343 | G1836 | BI701620 | 7.00E−35 | Glycine max | sai18a04.y1 Gm-c1053 Glycine max cDNA clone GEN
343 | G1836 | AW776719 | 2.00E−33 | Medicago truncatula | EST335784 DSIL Medicago truncatula cDNA
343 | G1836 | BQ411597 | 2.00E−33 | Gossypium arboreum | GA_Ed0041B06f Gossypium arboreum 7-10 d
343 | G1836 | BM065544 | 2.00E−32 | Capsicum annuum | KS07004F12 KS07 Capsicum annuum cDNA, mRNA
343 | G1836 | BG591677 | 3.00E−31 | Solanum tuberosum | EST499519 P. infestans-challenged leaf So
343 | G1836 | BU819346 | 6.00E−31 | Populus tremula | UA42BPF01 Populus tremula cambium cDNA libr
343 | G1836 | BH725354 | 4.00E−30 | Brassica oleracea | BOHVO37TF BO_2_3_KB Brassica oleracea gen
343 | G1836 | BE208917 | 6.00E−30 | Citrus x paradisi | GF-FV-P3F5 Marsh grapefruit young flavedo
343 | G1836 | AAAA01024926 | 5.00E−29 | Oryza sativa (indica cultivar-group) | ( ) scaffold024926
343 | G1836 | AW093662 | 9.00E−29 | Lycopersicon esculentum | EST286842 tomato mixed elicitor, BT
343 | G1836 | gi5257260 | 2.10E−29 | Oryza sativa | Similar to sequence of BAC F7G19 from Arabid
343 | G1836 | gi20804442 | 6.30E−16 | Oryza sativa (japonica cultivar-group) | hypothetical prote
343 | G1836 | gi18481626 | 2.00E−06 | Zea mays | repressor protein.
343 | G1836 | gi18539425 | 0.84 | Pinus sylvestris | putative malate dehydrogenase.
343 | G1836 | gi122084 | 1 | Hordeum vulgare | Histone H3.
343 | G1836 | gi225348 | 1 | Hordeum vulgare subsp. vulgare | histone H3.
369 | G1930 | BU025988 | 5.00E−88 | Helianthus annuus | QHG12J17.yg.ab1 QH_EFGHJ sunflower RHA280
369 | G1930 | AP003450 | 8.00E−80 | Oryza sativa | chromosome 1 clone P0034C09, *** SEQUENCING IN
369 | G1930 | AC135925 | 7.00E−79 | Oryza sativa (japonica cultivar-group) | ( ) chromosome 5 clo
369 | G1930 | AAAA01000997 | 3.00E−78 | Oryza sativa (indica cultivar-group) | ( ) scaffold000997
369 | G1930 | BU994579 | 1.00E−65 | Hordeum vulgare subsp. vulgare | HM07I08r HM Hordeum vulgare
369 | G1930 | BQ405698 | 1.00E−65 | Gossypium arboreum | GA_Ed0085H02f Gossypium arboreum 7-10 d
369 | G1930 | BF520598 | 1.00E−64 | Medicago truncatula | EST458071 DSIL Medicago truncatula cDNA
369 | G1930 | BZ015521 | 1.00E−64 | Brassica oleracea | oeg86a05.g1 B. oleracea002 Brassica olerac
369 | G1930 | BF424857 | 2.00E−58 | Glycine max | su59h03.y1 Gm-c1069 Glycine max cDNA clone GENO
369 | G1930 | BU870896 | 1.00E−56 | Populus balsamifera subsp. trichocarpa | Q019F06 Populus flow
369 | G1930 | gi18565433 | 4.10E−74 | Oryza sativa (japonica cultivar-group) | DNA-binding protei
369 | G1930 | gi12328560 | 1.80E−71 | Oryza sativa | putative DNA binding protein RAV2.
369 G1930gi10798644 1.40E−13 Nicotiana tabacum AP2 domain-containingtranscription fac 369 G1930 gi20340233 5.10E−11 Thellungiella ethyleneresponsive element halophila bindi 369 G1930 gi4099921 1.30E−10Stylosanthes hamata EREBP-3 homolog. 369 G1930 gi18496063 1.60E−10 Fagussylvatica ethylene responsive element binding prote 369 G1930 gi220740462.10E−10 Lycopersicon transcription factor JERF1. esculentum 369 G1930gi3264767 2.30E−10 Prunus armeniaca AP2 domain containing protein. 369G1930 gi18266198 1.10E−09 Narcissus AP-2 domain containingpseudonarcissus protein. 369 G1930 gi24940524 1.10E−09 Triticum aestivumethylene response element binding prote 407 G2133 BH420519 1.00E−53Brassica oleracea BOGUH88TF BOGU Brassica oleracea genomic 407 G2133BG543936 6.00E−43 Brassica rapa subsp. E1686 Chinese cabbage etiolpekinesis 407 G2133 AU292603 2.00E−28 Zinnia elegans AU292603 zinniacultured mesophyll cell equa 407 G2133 BE320193 6.00E−24 Medicagotruncatula NF024B04RT1F1029 Developing root Medica 407 G2133 AP0033463.00E−22 Oryza sativa chromosome 1 clone P0434C04, *** SEQUENCING IN 407G2133 AAAA01000718 3.00E−22 Oryza sativa ( indica ( ) scaffold000718cultivar-group) 407 G2133 AC124836 6.00E−22 Oryza sativa ( ) chromosome5 clo (japonica cultivar- group) 407 G2133 BZ403609 2.00E−20 Zea maysOGABN17TM ZM_0.7_1.5_KB Zea mays genomic clone ZMM 407 G2133 BM9854846.00E−19 Thellungiella 10_C12_T Ath Thellungiella halophila halophil 407G2133 BM403179 3.00E−17 Selaginella SLA012F10_35741 An lepidophyllaexpressed seque 407 G2133 gi20161239 6.90E−24 Oryza sativa hypotheticalprote (japonica cultivar- group) 407 G2133 gi8571476 6.00E−17 Atriplexhortensis apetala2 domain-containing protein. 407 G2133 gi141401557.80E−16 Oryza sativa putative AP2 domain transcription factor. 
407G2133 gi5616086 7.00E−15 Brassica napus dehydration responsive elementbinding pro 407 G2133 gi21908034 8.90E−15 Zea mays DRE binding factor 2.407 G2133 gi19071243 6.30E−14 Hordeum vulgare CRT/DRE binding factor 1.407 G2133 gi18535580 2.10E−13 Lycopersicon putative transcriptionalesculentum activato 407 G2133 gi1208496 3.30E−13 Nicotiana tabacumEREBP-3. 407 G2133 gi8980313 4.40E−13 Catharanthus roseus AP2-domainDNA-binding protein. 407 G2133 gi15488459 2.20E−12 Triticum aestivumAP2-containing protein. 417 G2153 BH566718 1.00E−127 Brassica oleraceaBOHCV23TR BOHC Brassica oleracea genomic 417 G2153 AP004971 2.00E−90Lotus japonicus genomic DNA, chromosome 5, clone: LjT45G21, 417 G2153AP004020 1.00E−79 Oryza sativa chromosome 2 clone OJ1119_A01, ***SEQUENCING 417 G2153 AAAA01017331 2.00E−72 Oryza sativa ( indica ( )scaffold017331 cultivar-group) 417 G2153 BQ165495 2.00E−67 Medicagotruncatula EST611364 KVKC Medicago truncatula cDNA 417 G2153 AP0056531.00E−66 Oryza sativa ( ) chromosome 2 clo (japonica cultivar- group)417 G2153 BQ785950 8.00E−64 Glycine max saq61f09.y1 Gm-c1076 Glycine maxcDNA clone SOY 417 G2153 BZ412041 3.00E−63 Zea mays OGACG56TCZM_0.7_1.5_KB Zea mays genomic clone ZMM 417 G2153 BM110212 3.00E−63Solanum tuberosum EST557748 potato roots Solanum tuberosum 417 G2153BQ865858 7.00E−63 Lactuca sativa QGC6B08.yg.ab1 QG_ABCDI lettuce salinasLact 417 G2153 gi24059979 3.80E−39 Oryza sativa similar to DNA-bin(japonica cultivar- group) 417 G2153 gi15528814 1.70E−36 Oryza sativahypothetical protein~similar to Arabidopsis 417 G2153 gi4165183 5.00E−21Antirrhinum majus SAP1 protein. 417 G2153 gi2213534 1.30E−19 Pisumsativum DNA-binding PD1-like protein. 417 G2153 gi7439981 2.60E−08Triticum aestivum glycine-rich RNA-binding protein GPR1 - 417 G2153gi21623 1.90E−06 Sorghum bicolor glycine-rich RNA-binding protein. 417G2153 gi11545668 3.50E−06 Chlamydomonas CIA5. reinhardtii 417 G2153gi21068672 6.60E−06 Cicer arietinum putative glicine-rich protein. 
417G2153 gi7489714 6.60E−06 Zea mays aluminum-induced protein al1 - maize.417 G2153 gi395147 1.60E−05 Nicotiana tabacum glycine-rich protein. 419G2155 BG543096 2.00E−69 Brassica rapa subsp. E0571 Chinese cabbage etiolpekinensis 419 G2155 BH480897 7.00E−66 Brassica oleracea BOGRA01TF BOGRBrassica oleracea genomic 419 G2155 BG646893 2.00E−53 Medicagotruncatula EST508512 HOGA Medicago truncatula cDNA 419 G2155 BU0235703.00E−44 Helianthus annuus QHF11M19.yg.ab1 QH_EFGHJ sunflower RHA280 419G2155 AP004020 2.00E−41 Oryza sativa chromosome 2 clone OJ1119_A01, ***SEQUENCING 419 G2155 BI426899 4.00E−41 Glycine max sag08g12.y1 Gm-c1080Glycine max cDNA clone GEN 419 G2155 AAAA01000383 2.00E−40 Oryza sativa( indica ( ) scaffold000383 cultivar-group) 419 G2155 AP004971 2.00E−40Lotus japonicus genomic DNA, chromosome 5, clone: LjT45G21, 419 G2155AP005755 2.00E−40 Oryza sativa ( ) chromosome 9 clo (japonica cultivar-group) 419 G2155 BZ412041 8.00E−39 Zea mays OGACG56TC ZM_0.7_1.5_KB Zeamays genomic clone ZMM 419 G2155 gi15528814 3.70E−32 Oryza sativahypothetical protein~similar to Arabidopsis 419 G2155 gi240599791.20E−21 Oryza sativa similar to DNA-bin (japonica cultivar- group) 419G2155 gi4165183 3.50E−20 Antirrhinum majus SAP1 protein. 419 G2155gi2213534 1.60E−16 Pisum sativum DNA-binding PD1-like protein. 419 G2155gi2224911 0.98 Daucus carota somatic embryogenesis receptor-like kinase.419 G2155 gi454279 1 Avena sativa DNA-binding protein. 
439 G2509BH989379 8.00E−66 Brassica oleracea oed22b05.b1 B.oleracea002 Brassicaolerac 439 G2509 BQ138607 4.00E−41 Medicago truncatula NF005C01PH1F1004Phoma-infected Medicag 439 G2509 BQ786702 4.00E−36 Glycine maxsaq72b07.y1 Gm-c1076 Glycine max cDNA clone SOY 439 G2509 OSJN002407.00E−31 Oryza sativa genomic DNA, chromosome 4, BAC clone: OSJNBa0 439G2509 AAAA01000832 7.00E−31 Oryza sativa ( indica ( ) scaffold000832cultivar-group) 439 G2509 BE419451 2.00E−29 Triticum aestivumWWS012.C2R000101 ITEC WWS Wheat Scutellum 439 G2509 BM062508 5.00E−29Capsicum annuum KS01043F09 KS01 Capsicum annuum cDNA, mRNA 439 G2509AI771755 2.00E−28 Lycopersicon EST252855 tomato ovary, esculentum TAMULycope 439 G2509 CA015575 7.00E−28 Hordeum vulgare HT14L19r HT Hordeumsubsp. vulgare vulgare 439 G2509 BE642320 2.00E−27 Ceratopterisrichardii Cri2_5_L17_SP6 Ceratopteris Spore Li 439 G2509 gi201608542.10E−29 Oryza sativa hypothetical prote (japonica cultivar- group) 439G2509 gi3264767 8.40E−28 Prunus armeniaca AP2 domain containing protein.439 G2509 gi24817250 1.10E−25 Cicer arietinum transcription factorEREBP- like protein. 439 G2509 gi15217291 7.10E−25 Oryza sativa PutativeAP2 domain containing protein. 439 G2509 gi1208498 1.60E−24 Nicotianatabacum EREBP-2. 439 G2509 gi8809571 1.60E−24 Nicotiana sylvestrisethylene-responsive element binding 439 G2509 gi7528276 3.00E−24Mesembryanthemum AP2-related transcription f crystallinum 439 G2509gi1688233 1.10E−23 Solanum tuberosum DNA binding protein homolog. 439G2509 gi4099921 1.60E−23 Stylosanthes hamata EREBP-3 homolog. 
439 G2509gi18496063 2.40E−23 Fagus sylvatica ethylene responsive element bindingprote 449 G2583 BH658452 1.00E−59 Brassica oleracea BOMCP74TF BO_2_3_KBBrassica oleracea gen 449 G2583 BE023297 5.00E−54 Glycine max sm80e10.y1Gm-c1015 Glycine max cDNA clone GENO 449 G2583 CA486875 1.00E−50Triticum aestivum WHE4337_A02_A03ZS Wheat meiotic anther cD 449 G2583BG642554 8.00E−48 Lycopersicon EST356031 tomato flower esculentum buds,anthe 449 G2583 BI978981 2.00E−47 Rosa chinensis zD09 Old Blush petalSMART library Rosa chin 449 G2583 BU978490 4.00E−47 Hordeum vulgareHA13G05r HA Hordeum subsp. vulgare vulgare 449 G2583 BQ106328 4.00E−46Rosa hybrid cultivar gg1388.e Rose Petals (Golden Gate) Lam 449 G2583BI958226 1.00E−44 Hordeum vulgare HVSMEn0013P17f Hordeum vulgare rachisEST 1 449 G2583 AP004869 1.00E−43 Oryza sativa ( ) chromosome 2 clo(japonica cultivar- group) 449 G2583 BU832200 6.00E−43 Populus tremula xPopulus T030G01 Populus apica tremuloides 449 G2583 gi18650662 2.30E−23Lycopersicon ethylene response factor 1. esculentum 449 G2583 gi1317547.30E−20 Lupinus polyphyllus PPLZ02 PROTEIN. 449 G2583 gi201608542.80E−18 Oryza sativa hypothetical prote (japonica cultivar- group) 449G2583 gi10798644 2.80E−18 Nicotiana tabacum AP2 domain-containingtranscription fac 449 G2583 gi8571476 2.80E−18 Atriplex hortensisapetala2 domain-containing protein. 449 G2583 gi14018047 3.30E−17 Oryzasativa Putative protein containing AP2 DNA binding 449 G2583 gi122258841.10E−16 Zea mays unnamed protein product. 449 G2583 gi3264767 1.10E−16Prunus armeniaca AP2 domain containing protein. 449 G2583 gi40999141.10E−16 Stylosanthes hamata ethylene-responsive element binding p 449G2583 gi8809573 1.40E−16 Nicotiana sylvestris ethylene-responsiveelement binding

Table 12 lists sequences discovered to be paralogous to a number of transcription factors of the present invention. The column headings include, from left to right, the Arabidopsis SEQ ID NO; the corresponding Arabidopsis Gene ID (GID) number; the SEQ ID NOs of the paralogs discovered in a database search; and the GID numbers of those paralogs (a paralog appearing in any cell of the fourth column is also paralogous to the other sequences in that cell).

TABLE 12
Arabidopsis Transcription Factors and Paralogs
SEQ ID NO: | GID No. | Paralog SEQ ID NO: | Paralog GID No.
10 | G28 | 6, 2074 | G6, G1006
12 | G47 | 408 | G2133
60 | G353 | 62, 2150, 2156, 2200 | G354, G1889, G1974, G2839
88 | G481 | 90, 2010, 2102, 2172 | G482, G485, G1364, G2345
94 | G489 | 2054 | G714
148 | G682 | 38, 1972, 2142, 2192 | G225, G226, G1816, G2718
170 | G867 | 1950, 2072, 370 | G9, G993, G1930
186 | G912 | 1958, 1960, 1962, 2162, 2184 | G40, G41, G42, G2107, G2513
188 | G913 | 2162 | G2107
200 | G975 | 2106, 450 | G1387, G2583
224 | G1073 | 2078, 2166 | G1067, G2156
226 | G1075 | 2080 | G1076
270 | G1411 | 440 | G2509
278 | G1451 | 2070 | G990
332 | G1792 | 1954, 2134, 2136 | G30, G1791, G1795
344 | G1836 | 340 | G1818
2158 | G1995 | 64, 66, 2198 | G361, G362, G2838
420 | G2155 | 2154 | G1945
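
The grouping convention stated for Table 12 (every sequence sharing a row with a reference transcription factor is treated as paralogous to the others in that row) can be illustrated with a minimal sketch. The dictionary below copies only four rows of Table 12; the names TABLE_12_SUBSET and paralog_group are hypothetical and are introduced here for illustration only, not as part of the disclosed methods.

```python
# Illustrative subset of Table 12: reference GID -> paralog GIDs.
# TABLE_12_SUBSET and paralog_group are hypothetical names used only
# for this sketch; the rows are copied from Table 12.
TABLE_12_SUBSET = {
    "G47": ["G2133"],
    "G867": ["G9", "G993", "G1930"],
    "G975": ["G1387", "G2583"],
    "G1792": ["G30", "G1791", "G1795"],
}


def paralog_group(gid, table=TABLE_12_SUBSET):
    """Return all GIDs sharing a Table 12 row with gid, in numeric order.

    An empty list is returned when gid is not found in any row.
    """
    for reference, paralogs in table.items():
        members = {reference, *paralogs}
        if gid in members:
            # Sort by the numeric part of the GID (the text after "G").
            return sorted(members, key=lambda g: int(g[1:]))
    return []


if __name__ == "__main__":
    print(paralog_group("G993"))
```

Under this assumption, looking up either the reference sequence or any one of its listed paralogs returns the same group, which mirrors the parenthetical note in the description of Table 12.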

Table 13 lists the gene identification number (GID) and homologous relationships found using analyses according to Example VIII for the sequences of the Sequence Listing.

TABLE 13
Homologous relationships found within the Sequence Listing
SEQ ID NO: | GID | DNA or Protein (PRT) | Species from Which Homologous Sequence is Derived | Relationship of SEQ ID NO: to Other Genes
468 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
469 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
470 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
471 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
472 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G19
473 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G19
474 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G19
475 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G19
476 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G19
477 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
478 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
491 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
492 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
493 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
494 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
495 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
496 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
497 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
498 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22, G28, G1006
499 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G22, G28, G1006
500 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G22, G28, G1006
501 | | PRT | Oryza sativa | Orthologous to G22, G28, G1006
502 | | PRT | Oryza sativa | Orthologous to G22, G28, G1006
503 | | PRT | Mesembryanthemum crystallinum | Orthologous to G22, G28, G1006
504 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G47, G2133
505 | | PRT | Oryza sativa | Orthologous to G47, G2133
550 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
551 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
552 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
553 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
554 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
555 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
556 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
557 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
558 | | PRT | Oryza sativa | Orthologous to G226, G682, G1816, G2718
559 | | PRT | Oryza sativa | Orthologous to G226, G682, G1816, G2718
610 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
611 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
612 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
613 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
614 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
615 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
616 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
617 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
618 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
619 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
620 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G353, G354, G1974, G1889, G2839
621 | | PRT | Oryza sativa | Orthologous to G353, G354, G1974, G1889, G2839
622 | | PRT | Oryza sativa | Orthologous to G353, G354, G1974, G1889, G2839
623 | | PRT | Oryza sativa | Orthologous to G353, G354, G1974, G1889, G2839
624 | | PRT | Oryza sativa | Orthologous to G353, G354, G1974, G1889, G2839
625 | | PRT | Oryza sativa | Orthologous to G353, G354, G1974, G1889, G2839
626 | | PRT | Oryza sativa | Orthologous to G353, G354, G1974, G1889, G2839
746 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
747 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
748 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
749 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
750 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
751 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
752 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
753 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
754 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
755 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
756 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
757 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
758 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
759 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
760 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
761 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
762 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
763 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
764 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
765 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
766 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
767 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
768 | | DNA | Gossypium arboreum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
769 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
770 | | DNA | Gossypium hirsutum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
771 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
772 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
773 | | DNA | Medicago truncatula | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
774 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
775 | | DNA | Solanum tuberosum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
776 | | DNA | Triticum aestivum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
777 | | DNA | Hordeum vulgare | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
778 | | DNA | Triticum monococcum | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
779 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G481, G482, G485, G1364, G2345
780 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
781 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
782 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
783 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
784 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
785 | | PRT | Zea mays | Orthologous to G481, G482, G485, G1364, G2345
786 | | PRT | Zea mays | Orthologous to G481, G482, G485, G1364, G2345
787 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
788 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
789 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
790 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
791 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
792 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
793 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
794 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
795 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
796 | | PRT | Oryza sativa | Orthologous to G481, G482, G485, G1364, G2345
797 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
798 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
799 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
800 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
801 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
802 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
803 | | PRT | Glycine max | Orthologous to G481, G482, G485, G1364, G2345
804 | | PRT | Zea mays | Orthologous to G481, G482, G485, G1364, G2345
805 | | PRT | Zea mays | Orthologous to G481, G482, G485, G1364, G2345
806 | | PRT | Zea mays | Orthologous to G481, G482, G485, G1364, G2345
807 | | PRT | Zea mays | Orthologous to G481, G482, G485, G1364, G2345
825 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
826 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
827 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
828 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
829 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
830 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
831 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489, G714
832 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G489, G714
833 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G489, G714
834 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G489, G714
835 | | PRT | Oryza sativa | Orthologous to G489, G714
836 | | PRT | Oryza sativa | Orthologous to G489, G714
837 | | PRT | Oryza sativa | Orthologous to G489, G714
981 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G634
982 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G634
983 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G634
984 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G634
985 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G634
986 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G634
987 | | PRT | Oryza sativa | Orthologous to G634
988 | | PRT | Oryza sativa | Orthologous to G634
1076 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1077 | | DNA | Hordeum vulgare subsp. vulgare | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1078 | | DNA | Populus tremula x Populus tremuloides | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1079 | | DNA | Triticum aestivum | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1080 | | DNA | Gossypium arboreum | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1081 | | PRT | Oryza sativa | Orthologous to G226, G682, G1816, G2718
1082 | | PRT | Oryza sativa | Orthologous to G226, G682, G1816, G2718
1083 | | PRT | Glycine max | Orthologous to G226, G682, G1816, G2718
1084 | | PRT | Glycine max | Orthologous to G226, G682, G1816, G2718
1085 | | PRT | Glycine max | Orthologous to G226, G682, G1816, G2718
1086 | | PRT | Glycine max | Orthologous to G226, G682, G1816, G2718
1087 | | PRT | Glycine max | Orthologous to G226, G682, G1816, G2718
1088 | | PRT | Glycine max | Orthologous to G226, G682, G1816, G2718
1089 | | PRT | Zea mays | Orthologous to G226, G682, G1816, G2718
1090 | | PRT | Zea mays | Orthologous to G226, G682, G1816, G2718
1159 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1160 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1161 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1162 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1163 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1164 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1165 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1166 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1167 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1168 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1169 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1170 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1171 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1172 | | DNA | Mesembryanthemum crystallinum | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1173 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1174 | | DNA | Solanum tuberosum | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1175 | | DNA | Hordeum vulgare | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1176 | | PRT | Oryza sativa | Orthologous to G9, G867, G993, G1930
1177 | | PRT | Oryza sativa | Orthologous to G9, G867, G993, G1930
1178 | | PRT | Oryza sativa | Orthologous to G9, G867, G993, G1930
1179 | | PRT | Oryza sativa | Orthologous to G9, G867, G993, G1930
1180 | | PRT | Oryza sativa | Orthologous to G9, G867, G993, G1930
1181 | | PRT | Oryza sativa | Orthologous to G9, G867, G993, G1930
1182 | | PRT | Glycine max | Orthologous to G9, G867, G993, G1930
1183 | | PRT | Glycine max | Orthologous to G9, G867, G993, G1930
1184 | | PRT | Glycine max | Orthologous to G9, G867, G993, G1930
1185 | | PRT | Zea mays | Orthologous to G9, G867, G993, G1930
1186 | | PRT | Zea mays | Orthologous to G9, G867, G993, G1930
1204 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1205 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1206 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1207 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1208 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1209 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1210 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1211 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1212 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G912, G913
1213 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1214 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1215 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G912, G913
1216 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1217 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1218 | | DNA | Brassica napus | Predicted polypeptide sequence is orthologous to G912, G913
1219 | | DNA | Solanum tuberosum | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1220 | | DNA | Descurainia sophia | Predicted polypeptide sequence is orthologous to G912, G2107, G2513
1221 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1222 | | PRT | Oryza sativa | Orthologous to G912, G913
1223 | | PRT | Oryza sativa | Orthologous to G912, G913
1224 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1225 | | PRT | Brassica napus | Orthologous to G912, G2107, G2513
1226 | | PRT | Nicotiana tabacum | Orthologous to G912, G2107, G2513
1227 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1228 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1229 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1230 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1231 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1232 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1233 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1234 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1235 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1236 | | PRT | Oryza sativa | Orthologous to G912, G2107, G2513
1237 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1238 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1239 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1240 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1241 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1242 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1243 | | PRT | Glycine max | Orthologous to G912, G2107, G2513
1244 | | PRT | Zea mays | Orthologous to G912, G2107, G2513
1245 | | PRT | Zea mays | Orthologous to G912, G2107, G2513
1246 | | PRT | Zea mays | Orthologous to G912, G2107, G2513
1247 | | PRT | Zea mays | Orthologous to G912, G2107, G2513
1248 | | PRT | Zea mays | Orthologous to G912, G2107, G2513
1249 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G922
1250 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G922
1251 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G922
1252 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G922
1253 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G922
1254 | | PRT | Oryza sativa | Orthologous to G922
1255 | | PRT | Oryza sativa | Orthologous to G922
1256 | | PRT | Oryza sativa | Orthologous to G922
1257 | | PRT | Oryza sativa | Orthologous to G922
1258 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G926
1259 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G926
1260 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G926
1261 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G926
1262 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G926
1263 | | PRT | Brassica napus | Orthologous to G926
1292 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1293 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1294 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1295 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1296 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1297 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1298 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1299 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1300 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1301 | | DNA | Brassica rapa | Predicted polypeptide sequence is orthologous to G975, G1387, G2583
1302 | | PRT | Oryza sativa | Orthologous to G975, G1387, G2583
1393 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1069, G2153
1394 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1069, G2153
1395 | | PRT | Oryza sativa | Orthologous to G1069, G2153
1396 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G1069, G2153
1397 | | DNA | Lotus japonicus | Predicted polypeptide sequence is orthologous to G1069, G2153
1398 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G1067, G1073, G2156
1399 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1400 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1401 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1402 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1403 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1404 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1405 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1406 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1407 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1408 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1409 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1410 | | PRT | Oryza sativa | Orthologous to G1067, G1073, G2156
1411 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1412 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1413 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1414 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1415 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1416 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1417 | | PRT | Glycine max | Orthologous to G1067, G1073, G2156
1418 | | PRT | Zea mays | Orthologous to G1067, G1073, G2156
1419 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1075, G1076
1420 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1075, G1076
1421 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1075, G1076
1422 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1075, G1076
1423 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1075, G1076
1424 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G1075, G1076
1425 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G1075, G1076
1426 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G1075, G1076
1587 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1411, G2509
1588 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1411, G2509
1589 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1411, G2509
1590 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1411, G2509
1591 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G1411, G2509
1604 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G990, G1451
1605 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G990, G1451
1606 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G990, G1451
1607 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G990, G1451
1608 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G990, G1451
1609 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G990, G1451
1610 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G990, G1451
1611 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G990, G1451
1612 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G990, G1451
1613 | | DNA | Medicago truncatula | Predicted polypeptide sequence is orthologous to G990, G1451
1614 | | DNA | Solanum tuberosum | Predicted polypeptide sequence is orthologous to G990, G1451
1615 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G990, G1451
1616 | | DNA | Sorghum propinquum | Predicted polypeptide sequence is orthologous to G990, G1451
1617 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G990, G1451
1618 | | DNA | Sorghum bicolor | Predicted polypeptide sequence is orthologous to G990, G1451
1619 | | DNA | Hordeum vulgare | Predicted polypeptide sequence is orthologous to G990, G1451
1620 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G990, G1451
1621 | | PRT | Oryza sativa | Orthologous to G990, G1451
1622 | | PRT | Oryza sativa | Orthologous to G990, G1451
1623 | | PRT | Oryza sativa | Orthologous to G990, G1451
1624 | | PRT | Oryza sativa | Orthologous to G990, G1451
1671 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1543
1672 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G1543
1673 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G1543
1674 | | PRT | Oryza sativa | Orthologous to G1543
1728 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1729 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1730 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1731 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1732 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1733 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1734 | | DNA | Lycopersicon esculentum | Predicted polypeptide sequence is orthologous to G30, G1791, G1792, G1795
1735 | G3380 | PRT | Oryza sativa | Orthologous to G30, G1791, G1792, G1795
1736 | G3381 | PRT | Oryza sativa indica | Orthologous to G30, G1791, G1792, G1795
1737 | G3383 | PRT | Oryza sativa japonica | Orthologous to G30, G1791, G1792, G1795
1795 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G9, G867, G993, G1930
1908 | | DNA | Medicago truncatula | Predicted polypeptide sequence is orthologous to G1945, G2155
1909 | | DNA | Medicago truncatula | Predicted polypeptide sequence is orthologous to G1945, G2155
1910 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G1945, G2155
1949 | G9 | DNA | Arabidopsis thaliana | Predicted polypeptide sequence is paralogous to G9, G867, G993, G1930
1950 | G9 | PRT | Arabidopsis thaliana | Paralogous to G9, G867, G993, G1930
1953 G30
DNAArabidopsis thaliana Predicted polypeptide sequence is paralogous toG30, G1791, G1792, G1795 1954 G30 PRT Arabidopsis thaliana startParalogous to G30, G1791, G1792, G1795 1955 G40 DNA Arabidopsis thalianaPredicted polypeptide sequence is paralogous to G912, G2107, G2513 1956G40 PRT Arabidopsis thaliana Paralogous to G912, G2107, G2513 1957 G41DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous toG912, G2107, G2513 1958 G41 PRT Arabidopsis thaliana Paralogous to G912,G2107, G2513 1959 G42 DNA Arabidopsis thaliana Predicted polypeptidesequence is paralogous to G912, G2107, G2513 1960 G42 PRT Arabidopsisthaliana Paralogous to G912, G2107, G2513 1971 G225 DNA Arabidopsisthaliana Predicted polypeptide sequence is paralogous to G226, G682,G1816, G2718 1972 G225 PRT Arabidopsis thaliana Paralogous to G226,G682, G1816, G2718 1991 G370 DNA Arabidopsis thaliana Predictedpolypeptide sequence is paralogous to G361, G362, G370, G1995, G2826,G2838 1992 G370 PRT Arabidopsis thaliana Paralogous to G361, G362, G370,G1995, G2826, G2838 2009 G485 DNA Arabidopsis thaliana Predictedpolypeptide sequence is paralogous to 481, G482, G485, G1364, G2345 2010G485 PRT Arabidopsis thaliana Paralogous to 481, G482, G485, G1364,G2345 2053 G714 DNA Arabidopsis thaliana Predicted polypeptide sequenceis paralogous to G489, G714 2054 G714 PRT Arabidopsis thalianaParalogous to G489, G714 2069 G990 DNA Arabidopsis thaliana Predictedpolypeptide sequence is paralogous to G990, G1451 2070 G990 PRTArabidopsis thaliana Paralogous to G990, G1451 2071 G993 DNA Arabidopsisthaliana Predicted polypeptide sequence is paralogous to G9, G867, G993,G1930 2072 G993 PRT Arabidopsis thaliana Paralogous to G9, G867, G993,G1930 2073 G1006 DNA Arabidopsis thaliana Predicted polypeptide sequenceis paralogous to G22, G28, G1006 2074 G1006 PRT Arabidopsis thalianaParalogous to G22, G28, G1006 2077 G1067 DNA Arabidopsis thalianaPredicted polypeptide sequence is paralogous to G1067, G1073, G2156 2078G1067 
PRT Arabidopsis thaliana Paralogous to G1067, G1073, G2156 2079G1076 DNA Arabidopsis thaliana Predicted polypeptide sequence isparalogous to G1075, G1076 2080 G1076 PRT Arabidopsis thalianaParalogous to G1075, G1076 2101 G1364 DNA Arabidopsis thaliana Predictedpolypeptide sequence is paralogous to 481, G482, G485, G1364, G2345 2102G1364 PRT Arabidopsis thaliana Paralogous to 481, G482, G485, G1364,G2345 2105 G1387 DNA Arabidopsis thaliana Predicted polypeptide sequenceis paralogous to G975, G1387, G2583 2106 G1387 PRT Arabidopsis thalianaParalogous to G975, G1387, G2583 2133 G1791 DNA Arabidopsis thalianaPredicted polypeptide sequence is paralogous to G30, G1791, G1792, G17952134 G1791 PRT Arabidopsis thaliana Paralogous to G30, G1791, G1792,G1795 2135 G1795 DNA Arabidopsis thaliana Predicted polypeptide sequenceis paralogous to G30, G1791, G1792, G1795 2136 G1795 PRT Arabidopsisthaliana Paralogous to G30, G1791, G1792, G1795 2141 G1816 DNAArabidopsis thaliana Predicted polypeptide sequence is paralogous toG226, G682, G1816, G2718 2142 G1816 PRT Arabidopsis thaliana Paralogousto G226, G682, G1816, G2718 2149 G1889 DNA Arabidopsis thalianaPredicted polypeptide sequence is paralogous to G353, G354, G1974,G1889, G2839 2150 G1889 PRT Arabidopsis thaliana Paralogous to G353,G354, G1974, G1889, G2839 2153 G1945 DNA Arabidopsis thaliana Predictedpolypeptide sequence is paralogous to G1945, G2155 2154 G1945 PRTArabidopsis thaliana Paralogous to G1945, G2155 2155 G1974 DNAArabidopsis thaliana Predicted polypeptide sequence is paralogous toG353, G354, G1974, G1889, G2839 2156 G1974 PRT Arabidopsis thalianaParalogous to G353, G354, G1974, G1889, G2839 2157 G1995 DNA Arabidopsisthaliana Predicted polypeptide sequence is paralogous to G361, G362,G1995, G2826, G2838 2158 G1995 PRT Arabidopsis thaliana Paralogous toG361, G362, G1995, G2826, G2838 2161 G2107 DNA Arabidopsis thalianaPredicted polypeptide sequence is paralogous to G912, G2107, G2513 2162G2107 PRT Arabidopsis 
thaliana Paralogous to G912, G2107, G2513 2165G2156 DNA Arabidopsis thaliana Predicted polypeptide sequence isparalogous to G1067, G1073, G2156 2166 G2156 PRT Arabidopsis thalianaParalogous to G1067, G1073, G2156 2171 G2345 DNA Arabidopsis thalianaPredicted polypeptide sequence is paralogous to 481, G482, G485, G1364,G2345 2172 G2345 PRT Arabidopsis thaliana Paralogous to 481, G482, G485,G1364, G2345 2183 G2513 DNA Arabidopsis thaliana Predicted polypeptidesequence is paralogous to G912, G2107, G2513 2184 G2513 PRT Arabidopsisthaliana Paralogous to G912, G2107, G2513 2191 G2718 DNA Arabidopsisthaliana Predicted polypeptide sequence is paralogous to G226, G682,G1816, G2718 2192 G2718 PRT Arabidopsis thaliana Paralogous to G226,G682, G1816, G2718 2199 G2839 DNA Arabidopsis thaliana Predictedpolypeptide sequence is paralogous to G353, G354, G1974, G1889, G28392200 G2839 PRT Arabidopsis thaliana Paralogous to G353, G354, G1974,G1889, G2839Molecular Modeling

Another means that may be used to confirm the utility and function of transcription factor sequences that are orthologous or paralogous to presently disclosed transcription factors is molecular modeling software. Molecular modeling is routinely used to predict polypeptide structure, and a variety of protein structure modeling programs, such as “Insight II” (Accelrys, Inc.), are commercially available for this purpose. Modeling can thus be used to predict which residues of a polypeptide can be changed without altering function (Crameri et al. (2003) U.S. Pat. No. 6,521,453). Thus, polypeptides that are sequentially similar can be shown to have a high likelihood of similar function by their structural similarity, which may, for example, be established by comparison of regions of superstructure. The relative tendencies of amino acids to form regions of superstructure (for example, helices and β-sheets) are well established. For example, O'Neil et al. ((1990) Science 250: 646-651) have discussed in detail the helix forming tendencies of amino acids. Tables of relative structure forming activity for amino acids can be used as substitution tables to predict which residues can be functionally substituted in a given region, for example, in DNA-binding domains of known transcription factors and equivalogs. Homologs that are likely to be functionally similar can then be identified.

Of particular interest is the structure of a transcription factor in the region of its conserved domain, such as those identified in Table 8. Structural analyses may be performed by comparing the structure of the known transcription factor around its conserved domain with those of orthologs and paralogs. Analysis of a number of polypeptides within a transcription factor group or clade, including the functionally or sequentially similar polypeptides provided in the Sequence Listing, may also provide an understanding of structural elements required to regulate transcription within a given family.

EXAMPLES

The invention, now being generally described, will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention and are not intended to limit the invention. It will be recognized by one of skill in the art that a transcription factor that is associated with a particular first trait may also be associated with at least one other, unrelated and inherent second trait which was not predicted by the first trait.

The complete descriptions of the traits associated with each polynucleotide of the invention are fully disclosed in Table 7 and Table 9. The complete description of the transcription factor gene family and identified conserved domains of the polypeptide encoded by the polynucleotide is fully disclosed in Table 8.

Example I Full Length Gene Identification and Cloning

Putative transcription factor sequences (genomic or ESTs) related to known transcription factors were identified in the Arabidopsis thaliana GenBank database with the tblastn sequence analysis program, using default parameters and a P-value cutoff threshold of −4 or −5 or lower, depending on the length of the query sequence. Putative transcription factor sequence hits were then screened to identify those containing particular sequence strings. If the sequence hits contained such sequence strings, the sequences were confirmed as transcription factors.
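The two-stage screen described above (a score cutoff followed by a search for particular sequence strings) may be sketched as follows. This is an illustrative sketch only, not the procedure actually used: it assumes that the cutoff of −4 or −5 refers to the base-10 exponent of the reported P-value, and the Hit record, the function name confirm_hits, and the motif string in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One database hit (hypothetical record layout)."""
    subject_id: str
    log10_p: float        # assumed: base-10 exponent of the reported P-value
    translated_seq: str   # translated subject sequence from the alignment


def confirm_hits(hits, cutoff_exponent, sequence_strings):
    """Return IDs of hits at or below the cutoff that contain a required string."""
    confirmed = []
    for hit in hits:
        # Stage 1: discard hits whose P-value exponent is above the cutoff.
        if hit.log10_p > cutoff_exponent:
            continue
        # Stage 2: keep only hits containing at least one required string.
        if any(s in hit.translated_seq for s in sequence_strings):
            confirmed.append(hit.subject_id)
    return confirmed
```

For example, calling confirm_hits(hits, -4, ["WRKYGQK"]) would retain only hits with an exponent of −4 or lower whose translated sequence contains the placeholder string; a cutoff of −5 could be passed instead for a different query length.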

Alternatively, Arabidopsis thaliana cDNA libraries derived from different tissues or treatments, or genomic libraries, were screened to identify novel members of a transcription family using a low stringency hybridization approach. Probes were synthesized using gene specific primers in a standard PCR reaction (annealing temperature 60° C.) and labeled with ³²P dCTP using the High Prime DNA Labeling Kit (Boehringer Mannheim Corp., now Roche Diagnostics Corp., Indianapolis, Ind.). Purified radiolabelled probes were added to filters immersed in Church hybridization medium (0.5 M NaPO₄ pH 7.0, 7% SDS, 1% w/v bovine serum albumin) and hybridized overnight at 60° C. with shaking. Filters were washed two times for 45 to 60 minutes with 1×SSC, 1% SDS at 60° C.

To identify additional sequence 5′ or 3′ of a partial cDNA sequence in a cDNA library, 5′ and 3′ rapid amplification of cDNA ends (RACE) was performed using the MARATHON cDNA amplification kit (Clontech, Palo Alto, Calif.). Generally, the method entailed first isolating poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, followed by ligation of the MARATHON Adaptor to the cDNA to form a library of adaptor-ligated ds cDNA.

Gene-specific primers were designed to be used along with adaptor specific primers for both 5′ and 3′ RACE reactions. Nested primers, rather than single primers, were used to increase PCR specificity. Using 5′ and 3′ RACE reactions, 5′ and 3′ RACE fragments were obtained, sequenced and cloned. The process can be repeated until the 5′ and 3′ ends of the full-length gene are identified. The full-length cDNA was then generated by end-to-end PCR using primers specific to the 5′ and 3′ ends of the gene.

Example II Construction of Expression Vectors

For the experiments in Example XIII, two types of constructs were used to modulate the activity of lead transcription factors and test the activity of orthologs and paralogs in transgenic plants. These included direct promoter fusion constructs and two component transformation systems.

For direct promoter fusion, expression of a single full-length wild-type version of a transcription factor polynucleotide sequence was driven by fusing the polynucleotide directly to a promoter. A number of different promoters may be used, such as the native promoter of that gene, or a promoter that drives tissue specific or conditional expression. In the direct promoter fusion assays found in Example XIII, the CaMV 35S constitutive promoter was used. To clone the sequence into the vector, both pMEN20, derived from pMON316 (Sanders et al. (1987) Nucleic Acids Res. 15:1543-1558), and the amplified DNA fragment (the sequence was amplified from a genomic or cDNA library using primers specific to sequences upstream and downstream of the coding region) were digested separately with SalI and NotI restriction enzymes at 37° C. for 2 hours. The digestion products were subjected to electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide staining. The DNA fragments containing the sequence and the linearized plasmid were excised and purified by using a QIAQUICK gel extraction kit (Qiagen, Valencia Calif.). The fragments of interest were ligated at a ratio of 3:1 (vector to insert). Ligation reactions using T4 DNA ligase (New England Biolabs, Beverly Mass.) were carried out at 16° C. for 16 hours. The ligated DNAs were transformed into competent cells of the E. coli strain DH5alpha by using the heat shock method. The transformations were plated on LB plates containing 50 mg/l kanamycin (Sigma Chemical Co. St. Louis Mo.). Individual colonies were grown overnight in five milliliters of LB broth containing 50 mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep kits (Qiagen).

For two component supertransformation (2 comp. supTfn), two separate constructs were used: Promoter::LexA-GAL4TA and opLexA::TF. The first of these (Promoter::LexA-GAL4TA) comprised a desired promoter cloned in front of a LexA DNA binding domain fused to a GAL4 activation domain. The construct vector backbone (MEN48; P5375) also carried a kanamycin resistance marker, along with an opLexA::GFP reporter. Transgenic lines were obtained containing this first component, and a line was selected that showed reproducible expression of the reporter gene in the desired pattern through a number of generations. A homozygous population was established for that line, and the population was supertransformed with the second construct (opLexA::TF) carrying the transcription factor of interest cloned behind a LexA operator site. This second construct vector backbone (pMEN53; P5381) also contained a sulfonamide resistance marker.

Example III Transformation of Agrobacterium with the Expression Vector

After the plasmid vector containing the gene was constructed, the vector was used to transform Agrobacterium tumefaciens cells expressing the gene products. The stock of Agrobacterium tumefaciens cells for transformation was made as described by Nagel et al. (1990) FEMS Microbiol Letts. 67: 325-328. Agrobacterium strain ABI was grown in 250 ml LB medium (Sigma) overnight at 28° C. with shaking until an absorbance over 1 cm at 600 nm (A₆₀₀) of 0.5-1.0 was reached. Cells were harvested by centrifugation at 4,000×g for 15 min at 4° C. Cells were then resuspended in 250 μl chilled buffer (1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were centrifuged again as described above and resuspended in 125 μl chilled buffer. Cells were then centrifuged and resuspended two more times in the same HEPES buffer as described above at a volume of 100 μl and 750 μl, respectively. Resuspended cells were then distributed into 40 μl aliquots, quickly frozen in liquid nitrogen, and stored at −80° C.

Agrobacterium cells were transformed with plasmids prepared as described above following the protocol described by Nagel et al. (supra). For each DNA construct to be transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was mixed with 40 μl of Agrobacterium cells. The DNA/cell mixture was then transferred to a chilled cuvette with a 2 mm electrode gap and subjected to a 2.5 kV charge dissipated at 25 μF and 200 μF using a Gene Pulser II apparatus (Bio-Rad, Hercules, Calif.). After electroporation, cells were immediately resuspended in 1.0 ml LB and allowed to recover without antibiotic selection for 2-4 hours at 28° C. in a shaking incubator. After recovery, cells were plated onto selective medium of LB broth containing 100 μg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 28° C. Single colonies were then picked and inoculated in fresh medium. The presence of the plasmid construct was verified by PCR amplification and sequence analysis.

Example IV Transformation of Arabidopsis Plants with Agrobacterium tumefaciens with Expression Vector

After transformation of Agrobacterium tumefaciens with plasmid vectors containing the gene, single Agrobacterium colonies were identified, propagated, and used to transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium containing 50 mg/l kanamycin were inoculated with the colonies and grown at 28° C. with shaking for 2 days until an optical absorbance at 600 nm wavelength over 1 cm (A₆₀₀) of >2.0 was reached. Cells were then harvested by centrifugation at 4,000×g for 10 min, and resuspended in infiltration medium (½× Murashige and Skoog salts (Sigma), 1× Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 μM benzylamino purine (Sigma), 200 μl/l Silwet L-77 (Lehle Seeds)) until an A₆₀₀ of 0.8 was reached.

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were sown at a density of ~10 plants per 4″ pot onto Pro-Mix BX potting medium (Hummert International) covered with fiberglass mesh (18 mm×16 mm). Plants were grown under continuous illumination (50-75 μE/m²/sec) at 22-23° C. with 65-70% relative humidity. After about 4 weeks, primary inflorescence stems (bolts) were cut off to encourage growth of multiple secondary bolts. After flowering of the mature secondary bolts, plants were prepared for transformation by removal of all siliques and opened flowers.

The pots were then immersed upside down in the mixture of Agrobacterium infiltration medium as described above for 30 sec, and placed on their sides to allow draining onto a 1′×2′ flat surface covered with plastic wrap. After 24 h, the plastic wrap was removed and pots were turned upright. The immersion procedure was repeated one week later, for a total of two immersions per pot. Seeds were then collected from each transformation pot and analyzed following the protocol described below.

Example V Identification of Arabidopsis Primary Transformants

Seeds collected from the transformation pots were sterilized essentially as follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 (Sigma) and sterile water and washed by shaking the suspension for 20 min. The wash solution was then drained and replaced with fresh wash solution to wash the seeds for 20 min with shaking. After removal of the ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% (v/v) bleach (CLOROX; Clorox Corp. Oakland Calif.) was added to the seeds, and the suspension was shaken for 10 min. After removal of the bleach/detergent solution, seeds were then washed five times in sterile distilled water. The seeds were stored in the last wash water at 4° C. for 2 days in the dark before being plated onto antibiotic selection medium (1× Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1× Gamborg's B-5 vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds were germinated under continuous illumination (50-75 μE/m²/sec) at 22-23° C. After 7-10 days of growth under these conditions, kanamycin resistant primary transformants (T₁ generation) were visible and obtained. These seedlings were transferred first to fresh selection plates where the seedlings continued to grow for 3-5 more days, and then to soil (Pro-Mix BX potting medium).

Primary transformants were crossed and progeny seeds (T₂) collected; kanamycin resistant seedlings were selected and analyzed. The expression levels of the recombinant polynucleotides in the transformants vary from about a 5% expression level increase to at least a 100% expression level increase. Similar observations are made with respect to polypeptide level expression.

Example VI Identification of Arabidopsis Plants with Transcription Factor Gene Knockouts

The screening of insertion mutagenized Arabidopsis collections for null mutants in a known target gene was essentially as described in Krysan et al. (1999) Plant Cell 11: 2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs to each other, were designed from the 5′ and 3′ regions of a known target gene. Similarly, nested sets of primers were also created specific to each of the T-DNA or transposon ends (the “right” and “left” borders). All possible combinations of gene specific and T-DNA/transposon primers were used to detect by PCR an insertion event within or close to the target gene. The amplified DNA fragments were then sequenced, which allowed the precise determination of the T-DNA/transposon insertion point relative to the target gene. Insertion events within the coding or intervening sequence of the genes were deconvoluted from a pool comprising a plurality of insertion events to a single unique mutant plant for functional characterization. The method is described in more detail in Yu and Adam, U.S. application Ser. No. 09/177,733 filed Oct. 23, 1998.
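The pairing of every gene-specific primer with every T-DNA/transposon border primer may be enumerated as in the following sketch. It is offered for illustration only; the primer labels are hypothetical placeholders rather than primers disclosed herein, and the function name primer_pairs is likewise an assumption.

```python
from itertools import product


def primer_pairs(gene_primers, border_primers):
    """Return every (gene-specific, border) primer pairing to be tested by PCR."""
    return list(product(gene_primers, border_primers))


# Hypothetical primer labels: one outer and one nested primer per gene end
# and per insertion border.
GENE_PRIMERS = ["gene_5p_outer", "gene_5p_nested", "gene_3p_outer", "gene_3p_nested"]
BORDER_PRIMERS = ["left_outer", "left_nested", "right_outer", "right_nested"]

if __name__ == "__main__":
    for gene_primer, border_primer in primer_pairs(GENE_PRIMERS, BORDER_PRIMERS):
        print(gene_primer, "+", border_primer)
```

With four primers of each kind, sixteen reactions are listed; in practice the outer and nested primers of a set would typically be used in successive rounds rather than simultaneously.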

Example VII Identification of Modified Phenotypes in Overexpression or Gene Knockout Plants

Experiments were performed to identify those transformants or knockouts that exhibited modified biochemical characteristics. Among the biochemicals that were assayed were insoluble sugars, such as arabinose, fucose, galactose, mannose, rhamnose or xylose or the like; prenyl lipids, such as lutein, beta-carotene, xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or gamma-tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic acid), 20:1 (eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by altering the levels of C29, C31, or C33 alkanes; sterols, such as brassicasterol, campesterol, stigmasterol, sitosterol or stigmastanol or the like; glucosinolates; and protein or oil levels.

Fatty acids were measured using two methods depending on whether the tissue was from leaves or seeds. For leaves, lipids were extracted and esterified with hot methanolic H₂SO₄ and partitioned into hexane from methanolic brine. For seed fatty acids, seeds were pulverized and extracted in methanol:heptane:toluene:2,2-dimethoxypropane:H₂SO₄ (39:34:20:5:2) for 90 minutes at 80° C. After cooling to room temperature the upper phase, containing the seed fatty acid esters, was subjected to GC analysis. Fatty acid esters from both seed and leaf tissues were analyzed with a SUPELCO SP-2330 column (Supelco, Bellefonte, Pa.).

Glucosinolates were purified from seeds or leaves by first heating the tissue at 95° C. for 10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 95° C. for a further 10 minutes, the extraction solvent was applied to a DEAE Sephadex column (Pharmacia) which had been previously equilibrated with 0.5 M pyridine acetate. Desulfoglucosinolates were eluted with 300 μl water and analyzed by reverse phase HPLC monitoring at 226 nm.

For wax alkanes, samples were extracted using the same method as for fatty acids and extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. Samples were chromatographically isolated on a J&W DB35 mass spectrometer (J&W Scientific Agilent Technologies, Folsom, Calif.).

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% pyrogallol as an antioxidant. For seeds, extracted samples were filtered and a portion removed for tocopherol and carotenoid/chlorophyll analysis by HPLC. The remaining material was saponified for sterol determination. For leaves, an aliquot was removed and diluted with methanol, and chlorophyll A, chlorophyll B, and total carotenoids were measured by spectrophotometry by determining optical absorbance at 665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for tocopherol and carotenoid/chlorophyll composition by HPLC using a Waters μBondapak C18 column (4.6 mm×150 mm). The remaining methanolic solution was saponified with 10% KOH at 80° C. for one hour. The samples were cooled and diluted with a mixture of methanol and water. A solution of 2% methylene chloride in hexane was mixed in and the samples were centrifuged. The aqueous methanol phase was again re-extracted with 2% methylene chloride in hexane and, after centrifugation, the two upper phases were combined and evaporated. 2% methylene chloride in hexane was added to the tubes and the samples were then extracted with one ml of water. The upper phase was removed, dried, and resuspended in 400 μl of 2% methylene chloride in hexane and analyzed by gas chromatography using a 50 m DB-5 ms (0.25 mm ID, 0.25 μm phase, J&W Scientific).

Insoluble sugar levels were measured by the method essentially described by Reiter et al. (1999), Plant J. 12: 335-345. This method analyzes the neutral sugar composition of cell wall polymers found in Arabidopsis leaves. Soluble sugars were separated from sugar polymers by extracting leaves with hot 70% ethanol. The remaining residue containing the insoluble polysaccharides was then acid hydrolyzed with allose added as an internal standard. Sugar monomers generated by the hydrolysis were then reduced to the corresponding alditols by treatment with NaBH₄, then were acetylated to generate the volatile alditol acetates which were then analyzed by GC-FID. Identity of the peaks was determined by comparing the retention times of known sugars converted to the corresponding alditol acetates with the retention times of peaks from wild-type plant extracts. Alditol acetates were analyzed on a Supelco SP-2330 capillary column (30 m×250 μm×0.2 μm) using a temperature program beginning at 180° C. for 2 minutes followed by an increase to 220° C. in 4 minutes. After holding at 220° C. for 10 minutes, the oven temperature was increased to 240° C. in 2 minutes, held at this temperature for 10 minutes, and brought back to room temperature.
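The oven temperature program described above may be written out as a sequence of segments, from which the ramp rates and the total programmed time follow by simple arithmetic. The sketch below is illustrative only; the segment representation and the function names are assumptions, and the final return to room temperature is omitted because no duration is stated for it.

```python
# Each segment is (start temperature in deg C, end temperature in deg C, minutes).
# Values are taken from the temperature program described above.
PROGRAM = [
    (180.0, 180.0, 2.0),   # initial hold
    (180.0, 220.0, 4.0),   # first ramp
    (220.0, 220.0, 10.0),  # hold
    (220.0, 240.0, 2.0),   # second ramp
    (240.0, 240.0, 10.0),  # final hold
]


def total_minutes(program):
    """Total programmed run time, excluding the cool-down."""
    return sum(minutes for _, _, minutes in program)


def ramp_rates(program):
    """Heating rate in deg C per minute for each ramp segment."""
    return [(end - start) / minutes
            for start, end, minutes in program
            if end != start]


if __name__ == "__main__":
    print("total minutes:", total_minutes(PROGRAM))
    print("ramp rates (deg C/min):", ramp_rates(PROGRAM))
```

Under these values both ramps correspond to 10° C. per minute and the programmed portion of the run lasts 28 minutes.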

To identify plants with alterations in total seed oil or protein content, 150 mg of seeds from T2 progeny plants were subjected to analysis by Near Infrared Reflectance Spectroscopy (NIRS) using a Foss NirSystems Model 6500 with a spinning cup transport system. NIRS is a non-destructive analytical method used to determine seed oil and protein composition. Infrared is the region of the electromagnetic spectrum located after the visible region in the direction of longer wavelengths. ‘Near infrared’ owes its name to being the infrared region near to the visible region of the electromagnetic spectrum. For practical purposes, near infrared comprises wavelengths between 800 and 2500 nm. NIRS is applied to organic compounds rich in O—H bonds (such as moisture, carbohydrates, and fats), C—H bonds (such as organic compounds and petroleum derivatives), and N—H bonds (such as proteins and amino acids). The NIRS analytical instruments operate by statistically correlating NIRS signals at several wavelengths with the characteristic or property intended to be measured. All biological substances contain thousands of C—H, O—H, and N—H bonds. Therefore, the exposure to near infrared radiation of a biological sample, such as a seed, results in a complex spectrum which contains qualitative and quantitative information about the physical and chemical composition of that sample.

The numerical value of a specific analyte in the sample, such as protein content or oil content, is obtained through a calibration approach known as chemometrics. Chemometrics applies statistical methods such as multiple linear regression (MLR), partial least squares (PLS), and principal component analysis (PCA) to the spectral data and correlates them with a physical property or other factor; that property or factor is directly determined rather than the analyte concentration itself. The method first provides “wet chemistry” data of the samples required to develop the calibration.

Calibration of NIRS response was performed using data obtained by wet chemical analysis of a population of Arabidopsis ecotypes that were expected to represent diversity of oil and protein levels.
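The principle of such a calibration may be illustrated with a deliberately simplified sketch in which a reference ("wet chemistry") value is regressed on the spectral signal at a single wavelength by ordinary least squares. This is not the calibration actually used, which would draw on many wavelengths and a method such as MLR or PLS; the function names and the numbers in the usage note are hypothetical.

```python
def fit_calibration(signals, reference_values):
    """Ordinary least-squares fit of reference value on signal.

    Returns (slope, intercept). Assumes at least two distinct signal values.
    """
    n = len(signals)
    mean_x = sum(signals) / n
    mean_y = sum(reference_values) / n
    sxx = sum((x - mean_x) ** 2 for x in signals)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(signals, reference_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(calibration, signal):
    """Apply a fitted calibration to the signal of a new sample."""
    slope, intercept = calibration
    return slope * signal + intercept
```

For example, fitting hypothetical signals of 0.10, 0.20 and 0.30 against reference oil contents of 30%, 35% and 40% gives a line that predicts about 37.5% for a new sample with a signal of 0.25.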

The exact oil composition of each ecotype used in the calibration experiment was determined using gravimetric analysis of oils extracted from seed samples (0.5 g or 1.0 g) by the accelerated solvent extraction method (ASE; Dionex Corp, Sunnyvale, Calif.). The extraction method was validated against certified canola samples (Community Bureau of Reference, Belgium). Seed samples from each ecotype (0.5 g or 1 g) were subjected to accelerated solvent extraction and the resulting extracted oil weights compared to the weight of oil recovered from canola seed that has been certified for oil content (Community Bureau of Reference). The oil calibration equation was based on 57 samples with a range of oil contents from 27.0% to 50.8%. To check the validity of the calibration curve, an additional set of samples was extracted by ASE and predicted using the oil calibration equation. This validation set comprised 46 samples, ranging from 27.9% to 47.5% oil, and had a predicted standard error of performance of 0.63%. The wet chemical method for protein was elemental analysis (% N×6.0) using the average of 3 representative samples of 5 mg each, validated against certified ground corn (NIST). The instrumentation was an Elementar Vario-EL III elemental analyzer operated in CNS operating mode (Elementar Analysensysteme GmbH, Hanau, Germany).
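The arithmetic underlying the protein reference values and the validation statistic may be sketched as follows. The conversion of percent nitrogen to percent protein by a factor of 6.0 and the averaging of three replicates follow the description above; the formula used for the standard error of performance (the standard deviation of the prediction differences after removal of the mean bias) is one common definition and is an assumption here, as are the function names.

```python
from math import sqrt


def protein_percent(nitrogen_percents, factor=6.0):
    """Mean % N of the replicate measurements multiplied by the conversion factor."""
    mean_n = sum(nitrogen_percents) / len(nitrogen_percents)
    return mean_n * factor


def standard_error_of_performance(reference, predicted):
    """Bias-corrected standard deviation of (predicted - reference).

    Assumed definition; requires at least two samples.
    """
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = sum(diffs) / len(diffs)
    return sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))
```

For example, three replicate nitrogen readings of 4.0% each correspond to a protein content of 24.0%.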

The protein calibration equation was based on a library of 63 samples with a range of protein contents from 17.4% to 31.2%. An additional set of samples was analyzed for protein by elemental analysis (n=57) and scanned by NIRS in order to validate the protein prediction equation. The protein range of the validation set was from 16.8% to 31.2% and the standard error of prediction was 0.468%.
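The validation step described above compares NIRS predictions with wet chemistry reference values. The sketch below shows one common, bias-corrected definition of the standard error of prediction; the text does not state which formula was used, so the definition, the function name and the sample values are all assumptions made for illustration.

```python
import math

def standard_error_of_prediction(reference, predicted):
    """Bias-corrected standard error of prediction (one common definition).

    reference: wet chemistry values; predicted: calibration-equation values.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = sum(diffs) / len(diffs)               # mean prediction error
    ss = sum((d - bias) ** 2 for d in diffs)     # squared deviations around the bias
    return math.sqrt(ss / (len(diffs) - 1))

# Hypothetical percent-protein values, not data from this study.
reference = [17.0, 20.0, 25.0, 30.0]
predicted = [17.5, 19.5, 25.5, 29.5]
sep = standard_error_of_prediction(reference, predicted)
print(round(sep, 3))
```

Because the bias is subtracted first, a calibration that is offset by a constant amount from the reference values gives a value of zero under this definition; a root-mean-square error without bias correction would not.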

NIRS analysis of Arabidopsis seed was carried out on 40-300 mg of experimental sample. The oil and protein contents were predicted using the respective calibration equations.

Data obtained from NIRS analysis were analyzed statistically using a nearest-neighbor (N-N) analysis. The N-N analysis allows removal of within-block spatial variability in a fairly flexible fashion, which does not require prior knowledge of the pattern of variability in the chamber. Ideally, all hybrids are grown under identical experimental conditions within a block (rep). In reality, even in many block designs, significant within-block variability exists. Nearest-neighbor procedures are based on the assumption that the environmental effect of a plot is closely related to that of its neighbors. Nearest-neighbor methods use information from adjacent plots to adjust for within-block heterogeneity and so provide more precise estimates of treatment means and differences. If there is within-plot heterogeneity on a spatial scale that is larger than a single plot and smaller than the entire block, then yields from adjacent plots will be positively correlated. Information from neighboring plots can be used to reduce or remove the unwanted effect of the spatial heterogeneity, and hence improve the estimate of the treatment effect. Data from neighboring plots can also be used to reduce the influence of competition between adjacent plots. The Papadakis N-N analysis can be used with designs to remove within-block variability that would not be removed with the standard split plot analysis (Papadakis (1973) Inst. d'Amelior. Plantes Thessaloniki (Greece) Bull. Scientif. No. 23; Papadakis (1984) Proc. Acad. Athens 59: 326-342).
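The idea of using neighboring plots as a covariate can be sketched for the simplest case, a single row of plots. The code below is a minimal illustration of a Papadakis-style adjustment, assuming a one-dimensional layout and only the two immediately adjacent plots as neighbors; the function name, the layout and the values are hypothetical, and the analysis actually applied to the NIRS data is not specified in more detail in the text.

```python
from collections import defaultdict

def papadakis_adjust(treatments, yields):
    """One-dimensional Papadakis-style adjustment for plots laid out in a row.

    Returns (raw treatment means, neighbor-adjusted treatment means).
    """
    n = len(yields)
    groups = defaultdict(list)
    for i, t in enumerate(treatments):
        groups[t].append(i)
    raw = {t: sum(yields[i] for i in idx) / len(idx) for t, idx in groups.items()}
    # Residual of each plot from its own treatment mean.
    resid = [yields[i] - raw[treatments[i]] for i in range(n)]
    # Covariate: mean residual of the immediately adjacent plots.
    cov = []
    for i in range(n):
        nb = [resid[j] for j in (i - 1, i + 1) if 0 <= j < n]
        cov.append(sum(nb) / len(nb) if nb else 0.0)
    cov_mean = {t: sum(cov[i] for i in idx) / len(idx) for t, idx in groups.items()}
    grand = sum(cov) / n
    # Pooled within-treatment regression slope of residual on covariate.
    sxy = sum((cov[i] - cov_mean[treatments[i]]) * resid[i] for i in range(n))
    sxx = sum((cov[i] - cov_mean[treatments[i]]) ** 2 for i in range(n))
    slope = sxy / sxx if sxx > 0 else 0.0
    adjusted = {t: raw[t] - slope * (cov_mean[t] - grand) for t in groups}
    return raw, adjusted

# Hypothetical row of 8 plots: true values A=10, B=12, plus a gradient of +1 per plot.
layout = list("ABABABAB")
observed = [10, 13, 12, 15, 14, 17, 16, 19]
raw, adjusted = papadakis_adjust(layout, observed)
print(raw["B"] - raw["A"], adjusted["B"] - adjusted["A"])
```

In this constructed example the gradient inflates the raw difference between B and A, and the neighbor covariate moves the estimated difference closer to the value used to build the data. Published forms of the method differ in how neighbors are chosen and whether the adjustment is iterated.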

Experiments were performed to identify those transformants or knockouts that exhibited modified sugar-sensing. For such studies, seeds from transformants were germinated on media containing 5% glucose or 9.4% sucrose, which normally partially restrict hypocotyl elongation. Plants with altered sugar sensing may have either longer or shorter hypocotyls than normal plants when grown on these media. Additionally, other plant traits may be varied, such as root mass.

Experiments may be performed to identify those transformants or knockouts that exhibited an improved pathogen tolerance. For such studies, the transformants are exposed to biotrophic fungal pathogens, such as Erysiphe orontii, and necrotrophic fungal pathogens, such as Fusarium oxysporum. Fusarium oxysporum isolates cause vascular wilts and damping off of various annual vegetables, perennials and weeds (Mauch-Mani and Slusarenko (1994) Molec Plant-Microbe Interact. 7: 378-383). For Fusarium oxysporum experiments, plants are grown on Petri dishes and sprayed with a fresh spore suspension of F. oxysporum. The spore suspension is prepared as follows: A plug of fungal hyphae from a plate culture is placed on a fresh potato dextrose agar plate and allowed to spread for one week. Five ml sterile water is then added to the plate, swirled, and pipetted into 50 ml Armstrong Fusarium medium. Spores are grown overnight in Fusarium medium and then sprayed onto plants using a Preval paint sprayer. Plant tissue is harvested and frozen in liquid nitrogen 48 hours post-infection.

Erysiphe orontii is a causal agent of powdery mildew. For Erysiphe orontii experiments, plants are grown approximately 4 weeks in a greenhouse under 12 hour light (20° C., ˜30% relative humidity (rh)). Individual leaves are infected with E. orontii spores from infected plants using a camel's hair brush, and the plants are transferred to a Percival growth chamber (20° C., 80% rh). Plant tissue is harvested and frozen in liquid nitrogen 7 days post-infection.

Botrytis cinerea is a necrotrophic pathogen. Botrytis cinerea is grown on potato dextrose agar under 12 hour light (20° C., ˜30% relative humidity (rh)). A spore culture is made by spreading 10 ml of sterile water on the fungus plate, swirling and transferring spores to 10 ml of sterile water. The spore inoculum (approx. 10⁵ spores/ml) is then used to spray 10 day-old seedlings grown under sterile conditions on MS (minus sucrose) media. Symptoms are evaluated every day up to approximately 1 week.

Sclerotinia sclerotiorum hyphal cultures are grown in potato dextrose broth. One gram of hyphae is ground, filtered, spun down and resuspended in sterile water. A 1:10 dilution is used to spray 10 day-old seedlings grown aseptically under a 12 hour light/dark regime on MS (minus sucrose) media. Symptoms are evaluated every day up to approximately 1 week.

Pseudomonas syringae pv maculicola (Psm) strain 4326 was inoculated by hand at two doses. Two inoculation doses allow differentiation between plants with enhanced susceptibility and plants with enhanced resistance to the pathogen. Plants are grown for 3 weeks in the greenhouse, then transferred to the growth chamber for the remainder of their growth. Psm ES4326 may be hand inoculated with a 1 ml syringe on 3 fully-expanded leaves per plant (4½ wk old), using at least 9 plants per overexpressing line at two inoculation doses, OD=0.005 and OD=0.0005. Disease scoring is performed at day 3 post-inoculation, with pictures of the plants and leaves taken in parallel.

In some instances, expression patterns of the pathogen-induced genes (such as defense genes) may be monitored by microarray experiments. In these experiments, cDNAs are generated by PCR and resuspended at a final concentration of ˜100 ng/μl in 3×SSC or 150 mM Na-phosphate (Eisen and Brown (1999) Methods Enzymol. 303: 179-205). The cDNAs are spotted on microscope glass slides coated with polylysine. The prepared cDNAs are aliquoted into 384 well plates and spotted on the slides using, for example, an x-y-z gantry (OmniGrid), which may be purchased from GeneMachines (Menlo Park, Calif.), outfitted with quill type pins, which may be purchased from Telechem International (Sunnyvale, Calif.). After spotting, the arrays are cured for a minimum of one week at room temperature, rehydrated and blocked following the protocol recommended by Eisen and Brown (1999; supra).

Total RNA samples (10 μg) are labeled using fluorescent Cy3 and Cy5 dyes. Labeled samples are resuspended in 4×SSC/0.03% SDS/4 μg salmon sperm DNA/2 μg tRNA/50 mM Na-pyrophosphate, heated at 95° C. for 2.5 minutes, spun down and placed on the array. The array is then covered with a glass coverslip and placed in a sealed chamber. The chamber is then kept in a water bath at 62° C. overnight. The arrays are washed as described in Eisen and Brown (1999, supra) and scanned on a General Scanning 3000 laser scanner. The resulting files are subsequently quantified using IMAGENE software (BioDiscovery, Los Angeles Calif.).

RT-PCR experiments may be performed to identify those genes induced after exposure to biotrophic fungal pathogens, such as Erysiphe orontii, necrotrophic fungal pathogens, such as Fusarium oxysporum, bacteria, viruses and salicylic acid, the latter being involved in a nonspecific resistance response in Arabidopsis thaliana. Generally, the gene expression patterns from ground plant leaf tissue are examined.

Reverse transcriptase PCR was conducted using gene specific primers within the coding region for each sequence identified. The primers were designed near the 3′ region of each DNA binding sequence initially identified.

Total RNA from these ground leaf tissues was isolated using the CTAB extraction protocol. Once extracted, total RNA was normalized in concentration across all the tissue types, using the 28S band as reference, to ensure that the PCR reaction for each tissue received the same amount of cDNA template. Poly(A+) RNA was purified using a modified protocol from the Qiagen OLIGOTEX purification kit batch protocol. cDNA was synthesized using standard protocols. After the first strand cDNA synthesis, primers for Actin 2 were used to normalize the concentration of cDNA across the tissue types. Actin 2 is found to be constitutively expressed at fairly equal levels across the tissue types being investigated.

For RT PCR, cDNA template was mixed with corresponding primers and Taq DNA polymerase. Each reaction consisted of 0.2 μl cDNA template, 2 μl 10× Tricine buffer, 2 μl 10× Tricine buffer and 16.8 μl water, 0.05 μl Primer 1, 0.05 μl Primer 2, 0.3 μl Taq DNA polymerase and 8.6 μl water.

The 96 well plate is covered with microfilm and set in the thermocycler to start the reaction cycle. By way of illustration, the reaction cycle may comprise the following steps:

Step 1: 93° C. for 3 min;

Step 2: 93° C. for 30 sec;

Step 3: 65° C. for 1 min;

Step 4: 72° C. for 2 min;

Steps 2, 3 and 4 are repeated for 28 cycles;

Step 5: 72° C. for 5 min; and

Step 6: 4° C.
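The cycle listed above can be written down as data, which makes the programmed run time easy to check. This is a small sketch only; the variable and function names are hypothetical, and ramping time between temperatures is ignored.

```python
# Cycling program from the steps above: (temperature in deg C, seconds).
INITIAL_DENATURATION = (93, 180)             # step 1
CYCLE = [(93, 30), (65, 60), (72, 120)]      # steps 2-4
FINAL_EXTENSION = (72, 300)                  # step 5
HOLD_TEMPERATURE = 4                         # step 6, held until the plate is removed

def program_minutes(n_cycles):
    """Programmed hold time in minutes, ignoring ramping between temperatures."""
    per_cycle = sum(seconds for _, seconds in CYCLE)
    total = INITIAL_DENATURATION[1] + n_cycles * per_cycle + FINAL_EXTENSION[1]
    return total / 60

print(program_minutes(28))
```

With 28 repeats of steps 2-4 the programmed time comes to 106 minutes before the 4° C. hold.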

To amplify more products, for example, to identify genes that have very low expression, additional steps may be performed. The following illustrates a method that may be used in this regard. The PCR plate is placed back in the thermocycler for 8 more cycles of steps 2-4.

Step 2: 93° C. for 30 sec;

Step 3: 65° C. for 1 min;

Step 4: 72° C. for 2 min, repeated for 8 cycles; and

Step 5: 4° C.

Eight microliters of PCR product and 1.5 μl of loading dye are loaded on a 1.2% agarose gel for analysis after 28 cycles and 36 cycles. Expression levels of specific transcripts are considered low if they are only detectable after 36 cycles of PCR. Expression levels are considered medium or high depending on the levels of transcript compared with observed transcript levels for an internal control such as actin2. Transcript levels are determined in repeat experiments and compared to transcript levels in control (e.g., non-transformed) plants.
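The scoring rule above reduces to a short decision function. The sketch below is illustrative only: the function name and the "not detected" label are assumptions, and the text gives no numeric threshold for separating medium from high, so those two classes are left combined.

```python
def classify_expression(detected_at_28, detected_at_36):
    """Classify a transcript from band visibility after 28 and 36 PCR cycles."""
    if detected_at_28:
        # Medium versus high is judged against the actin2 control band;
        # no numeric cut-off is given, so the two are not separated here.
        return "medium or high"
    if detected_at_36:
        return "low"
    return "not detected"  # label assumed for illustration

print(classify_expression(False, True))
```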

Example VIII Identification of Homologous Sequences

This example describes identification of genes that are orthologous to Arabidopsis thaliana transcription factors from a computer homology search.

Homologous sequences, including those of paralogs and orthologs from Arabidopsis and other plant species, were identified using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215: 403-410; and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the BLOSUM-62 scoring matrix (Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. 89: 10915-10919). The entire NCBI GenBank database was filtered for sequences from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with taxonomic ID 3701 (Arabidopsis thaliana).
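The taxonomic filter described above amounts to an include/exclude test on each entry's lineage. The sketch below assumes each record already carries a list of the taxonomic IDs in its lineage; the record names and that data layout are hypothetical, and the two filter IDs are the ones stated in the text.

```python
VIRIDIPLANTAE_ID = 33090   # "all plants", as given in the text
EXCLUDED_ID = 3701         # ID given in the text for the excluded Arabidopsis entries

def keep_record(lineage_ids):
    """Keep plant entries that are not associated with the excluded ID."""
    ids = set(lineage_ids)
    return VIRIDIPLANTAE_ID in ids and EXCLUDED_ID not in ids

# Hypothetical records with simplified lineages.
records = {
    "hypothetical_rice_entry": [33090, 4530],
    "hypothetical_arabidopsis_entry": [33090, 3701],
    "hypothetical_yeast_entry": [4932],
}
selected = [name for name, lineage in records.items() if keep_record(lineage)]
print(selected)
```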

These sequences are compared to sequences representing genes of SEQ ID NO: 2N−1, wherein N=1-229, using the Washington University TBLASTX algorithm (version 2.0a19MP) at the default settings using gapped alignments with the filter “off”. For each gene of SEQ ID NO: 2N−1, wherein N=1-229, individual comparisons were ordered by probability score (P-value), where the score reflects the probability that a particular alignment occurred by chance. For example, a score of 3.6e-40 is 3.6×10⁻⁴⁰. In addition to P-values, comparisons were also scored by percentage identity. Percentage identity reflects the degree to which two segments of DNA or protein are identical over a particular length. Examples of sequences so identified are presented in Table 10 and Table 12. Paralogous or orthologous sequences were readily identified and available in GenBank by Accession number (Table 10; Test sequence ID). The percent sequence identity among these sequences can be as low as 47%, or even lower.
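The two scores described above can be illustrated briefly: ordering comparisons by P-value, and computing percentage identity over an aligned segment. This is a sketch under stated assumptions; the hit names and P-values are invented, and counting identity over all alignment columns (gaps included in the length) is only one of several conventions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned segments must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

# Hypothetical hits as (name, P-value); 3.6e-40 is Python notation for 3.6 x 10^-40.
hits = [("hit_A", 3.6e-40), ("hit_B", 1.2e-5), ("hit_C", 7.0e-88)]
ranked = sorted(hits, key=lambda hit: hit[1])   # smallest P-value first

print([name for name, _ in ranked])
print(percent_identity("ACGT", "ACGA"))
```

Sorting in ascending order places the alignments least likely to have arisen by chance at the top of the list.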

Candidate paralogous sequences were identified among Arabidopsis transcription factors through alignment, identity, and phylogenetic relationships. A list of paralogs is shown in Table 12. Candidate orthologous sequences were identified from proprietary unigene sets of plant gene sequences in Zea mays, Glycine max and Oryza sativa based on significant homology to Arabidopsis transcription factors. These candidates were reciprocally compared to the set of Arabidopsis transcription factors. If the candidate showed maximal similarity in the protein domain to the eliciting transcription factor or to a paralog of the eliciting transcription factor, then it was considered to be an ortholog. Identified non-Arabidopsis sequences that were shown in this manner to be orthologous to the Arabidopsis sequences are provided in Table 10.
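The reciprocal comparison rule above can be expressed as a simple test on the candidate's best Arabidopsis match. The sketch below assumes the best match has already been determined by a similarity search; the function name and the paralog table layout are hypothetical, and CBF1 with its paralogs CBF2 and CBF3 is used only as an example grouping.

```python
def is_ortholog(best_arabidopsis_hit, eliciting_tf, paralogs):
    """True if the candidate's best Arabidopsis match is the eliciting
    transcription factor or one of its listed paralogs."""
    allowed = {eliciting_tf} | set(paralogs.get(eliciting_tf, ()))
    return best_arabidopsis_hit in allowed

# Example paralog table (layout assumed for illustration).
paralog_table = {"CBF1": ["CBF2", "CBF3"]}

print(is_ortholog("CBF2", "CBF1", paralog_table))
print(is_ortholog("G867", "CBF1", paralog_table))
```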

Example IX Identification of Orthologous and Paralogous Sequences

Orthologs to Arabidopsis genes may be identified by several methods, including experimental methods such as hybridization and/or amplification. This example describes how one may identify equivalogs to the Arabidopsis AP2 family transcription factor CBF1 (polynucleotide SEQ ID NO: 1955, encoded polypeptide SEQ ID NO: 1956), which confers tolerance to abiotic stresses (Thomashow et al. (2002) U.S. Pat. No. 6,417,428), and an example to confirm the function of homologous sequences. In this example, orthologs to CBF1 were found in canola (Brassica napus) using polymerase chain reaction (PCR).

Degenerate primers were designed for regions of the AP2 binding domain and outside of the AP2 domain (carboxyl terminal domain):

Mol 368 (reverse) (SEQ ID NO: 2205) 5′- CAY CCN ATH TAY MGN GGN GT -3′

Mol 378 (forward) (SEQ ID NO: 2206) 5′- GGN ARN ARC ATN CCY TCN GCC -3′

(Y: C/T, N: A/C/G/T, H: A/C/T, M: A/C, R: A/G)

Primer Mol 368 is in the AP2 binding domain of CBF1 (amino acid sequence: His-Pro-Ile-Tyr-Arg-Gly-Val) while primer Mol 378 is outside the AP2 domain (carboxyl terminal domain) (amino acid sequence: Met-Ala-Glu-Gly-Met-Leu-Leu-Pro).
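The degenerate codes listed for these primers can be expanded programmatically, for example to count how many distinct oligonucleotides each primer pool represents or to test whether a concrete sequence is covered by it. The following sketch uses only the five ambiguity codes defined above; the function names and the test sequences are hypothetical.

```python
import re

# Bases allowed at each position, using the codes defined for these primers.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "R": "AG", "M": "AC", "H": "ACT", "N": "ACGT"}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", ""):
        count *= len(IUPAC[base])
    return count

def primer_regex(primer):
    """Compile a regular expression matching every sequence in the pool."""
    return re.compile("".join("[" + IUPAC[b] + "]" for b in primer.replace(" ", "")))

MOL_368 = "CAY CCN ATH TAY MGN GGN GT"
MOL_378 = "GGN ARN ARC ATN CCY TCN GCC"

print(degeneracy(MOL_368), degeneracy(MOL_378))
# One hypothetical member of the Mol 368 pool (codons for His-Pro-Ile-Tyr-Arg-Gly, then GT).
print(bool(primer_regex(MOL_368).fullmatch("CATCCGATTTATCGTGGTGT")))
```

The counts show that each primer is a mixture of well over a thousand sequences, which is why a relatively low annealing temperature is typically used with pools of this kind.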

The genomic DNA isolated from B. napus was PCR-amplified by using these primers following these conditions: an initial denaturation step of 2 min at 93° C.; 35 cycles of 93° C. for 1 min, 55° C. for 1 min, and 72° C. for 1 min; and a final incubation of 7 min at 72° C. at the end of cycling.

The PCR products were separated by electrophoresis on a 1.2% agarose gel, transferred to nylon membrane and hybridized with the AT CBF1 probe prepared from Arabidopsis genomic DNA by PCR amplification. The hybridized products were visualized by a colorimetric detection system (Boehringer Mannheim) and the corresponding bands from a similar agarose gel were isolated using the Qiagen Extraction Kit (Qiagen). The DNA fragments were ligated into the TA clone vector from the TOPO TA Cloning Kit (Invitrogen) and transformed into E. coli strain TOP10 (Invitrogen).

Seven colonies were picked and, after plasmid DNA isolation, the inserts were sequenced on an ABI 377 machine from both the sense and antisense strands. The DNA sequence was edited by sequencer and aligned with AtCBF1 by GCG software and NCBI blast searching.

The nucleic acid sequence and amino acid sequence of one canola ortholog identified by this process (bnCBF1; polynucleotide SEQ ID NO: 2203 and polypeptide SEQ ID NO: 2204) are shown in the Sequence Listing.

The aligned amino acid sequences show that the bnCBF1 gene has 88% identity with the Arabidopsis sequence in the AP2 domain region and 85% identity with the Arabidopsis sequence outside the AP2 domain when aligned for two insertion sequences that are outside the AP2 domain.

Similarly, paralogous sequences to Arabidopsis genes, such as CBF1, may also be identified.

Two paralogs of CBF1 from Arabidopsis thaliana, CBF2 and CBF3, have been cloned and sequenced as described below. The DNA sequences (SEQ ID NO: 1957 and 1959) and encoded proteins (SEQ ID NO: 1958 and 1960) are set forth in the Sequence Listing.

A lambda cDNA library prepared from RNA isolated from Arabidopsis thaliana ecotype Columbia (Lin and Thomashow (1992) Plant Physiol. 99: 519-525) was screened for recombinant clones that carried inserts related to the CBF1 gene (Stockinger et al. (1997) Proc. Natl. Acad. Sci. 94:1035-1040). CBF1 was ³²P-radiolabeled by random priming (Sambrook et al. (1998) supra) and used to screen the library by the plaque-lift technique using standard stringent hybridization and wash conditions (Hajela et al. (1990) Plant Physiol. 93:1246-1252; Sambrook et al. (1998) supra; 6×SSPE buffer, 60° C. for hybridization and 0.1×SSPE buffer, 60° C. for washes). Twelve positively hybridizing clones were obtained and the DNA sequences of the cDNA inserts were determined. The results indicated that the clones fell into three classes. One class carried inserts corresponding to CBF1. The two other classes carried sequences corresponding to two different homologs of CBF1, designated CBF2 and CBF3. The nucleic acid sequences and predicted protein coding sequences for Arabidopsis CBF1, CBF2 and CBF3 are listed in the Sequence Listing (SEQ ID NOs: 1955, 1957, 1959 and SEQ ID NOs: 1956, 1958, 1960, respectively). The nucleic acid sequence and predicted protein coding sequence for the Brassica napus CBF ortholog are listed in the Sequence Listing (SEQ ID NOs: 2203 and 2204, respectively).

A comparison of the nucleic acid sequences of Arabidopsis CBF1, CBF2 and CBF3 indicates that they are 83 to 85% identical, as shown in Table 14.

TABLE 14

Percent identity^(a)

             DNA^(b)    Polypeptide
cbf1/cbf2    85         86
cbf1/cbf3    83         84
cbf2/cbf3    84         85

^(a)Percent identity was determined using the Clustal algorithm from the Megalign program (DNASTAR, Inc.).
^(b)Comparisons of the nucleic acid sequences of the open reading frames are shown.

Similarly, the amino acid sequences of the three CBF polypeptides range from 84 to 86% identity. An alignment of the three amino acid sequences reveals that most of the differences in amino acid sequence occur in the acidic C-terminal half of the polypeptide. This region of CBF1 serves as an activation domain in both yeast and Arabidopsis (not shown).

Residues 47 to 106 of CBF1 correspond to the AP2 domain of the protein, a DNA binding motif that, to date, has only been found in plant proteins. A comparison of the AP2 domains of CBF1, CBF2 and CBF3 indicates that there are a few differences in amino acid sequence. These differences in amino acid sequence might have an effect on DNA binding specificity.

Example X Screen of Plant cDNA Library for Sequence Encoding a Transcription Factor DNA Binding Domain that Binds to a Transcription Factor Binding Promoter Element and Demonstration of Protein Transcription Regulation Activity

The “one-hybrid” strategy (Li and Herskowitz (1993) Science 262: 1870-1874) is used to screen for plant cDNA clones encoding a polypeptide comprising a transcription factor DNA binding domain, a conserved domain. In brief, yeast strains are constructed that contain a lacZ reporter gene with either wild-type or mutant transcription factor binding promoter element sequences in place of the normal UAS (upstream activator sequence) of the GAL1 promoter. Yeast reporter strains are constructed that carry transcription factor binding promoter element sequences as UAS elements operably linked upstream (5′) of a lacZ reporter gene with a minimal GAL1 promoter. The strains are transformed with a plant expression library that contains random cDNA inserts fused to the GAL4 activation domain (GAL4-ACT) and screened for blue colony formation on X-gal-treated filters (X-gal: 5-bromo-4-chloro-3-indolyl-β-D-galactoside; Invitrogen Corporation, Carlsbad Calif.). Alternatively, the strains are transformed with a cDNA polynucleotide encoding a known transcription factor DNA binding domain polypeptide sequence.

Yeast strains carrying these reporter constructs produce low levels of beta-galactosidase and form white colonies on filters containing X-gal. The reporter strains carrying wild-type transcription factor binding promoter element sequences are transformed with a polynucleotide that encodes a polypeptide comprising a plant transcription factor DNA binding domain operably linked to the acidic activator domain of the yeast GAL4 transcription factor, “GAL4-ACT”. The clones that contain a polynucleotide encoding a transcription factor DNA binding domain operably linked to GAL4-ACT can bind upstream of the lacZ reporter genes carrying the wild-type transcription factor binding promoter element sequence, activate transcription of the lacZ gene and result in yeast forming blue colonies on X-gal-treated filters.

Upon screening about 2×10⁶ yeast transformants, positive cDNA clones are isolated; i.e., clones that cause yeast strains carrying lacZ reporters operably linked to wild-type transcription factor binding promoter elements to form blue colonies on X-gal-treated filters. The cDNA clones do not cause a yeast strain carrying mutant type transcription factor binding promoter elements fused to lacZ to turn blue. Thus, a polynucleotide encoding a transcription factor DNA binding domain, a conserved domain, is shown to activate transcription of a gene.

Example XI Gel Shift Assays

The presence of a transcription factor comprising a DNA binding domain which binds to a DNA transcription factor binding element is evaluated using the following gel shift assay. The transcription factor is recombinantly expressed and isolated from E. coli or isolated from plant material. Total soluble protein, including transcription factor, (40 ng) is incubated at room temperature in 10 μl of 1× binding buffer (15 mM HEPES (pH 7.9), 1 mM EDTA, 30 mM KCl, 5% glycerol, 5% bovine serum albumin, 1 mM DTT) plus 50 ng poly(dI-dC):poly(dI-dC) (Pharmacia, Piscataway N.J.) with or without 100 ng competitor DNA. After 10 minutes incubation, probe DNA comprising a DNA transcription factor binding element (1 ng) that has been ³²P-labeled by end-filling (Sambrook et al. (1989) supra) is added and the mixture incubated for an additional 10 minutes. Samples are loaded onto polyacrylamide gels (4% w/v) and fractionated by electrophoresis at 150V for 2 h (Sambrook et al. (1998) supra). The degree of transcription factor-probe DNA binding is visualized using autoradiography. Probes and competitor DNAs are prepared from oligonucleotide inserts ligated into the BamHI site of pUC118 (Vieira et al. (1987) Methods Enzymol. 153: 3-11). Orientation and concatenation number of the inserts are determined by dideoxy DNA sequence analysis (Sambrook et al. (1998) supra). Inserts are recovered after restriction digestion with EcoRI and HindIII and fractionation on polyacrylamide gels (12% w/v) (Sambrook et al. (1998) supra).

Example XII Introduction of Polynucleotides into Dicotyledonous Plants

Transcription factor sequences listed in the Sequence Listing recombined into pMEN20 or pMEN65 expression vectors are transformed into a plant for the purpose of modifying plant traits. The cloning vector may be introduced into a variety of cereal plants by means well known in the art such as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. It is now routine to produce transgenic plants using most dicot plants (see Weissbach and Weissbach, (1989) supra; Gelvin et al. (1990) supra; Herrera-Estrella et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods for analysis of traits are routine in the art and examples are disclosed above.

Example XIII Examples of Genes that Confer Significant Improvements to Plants

Experiments were performed to identify those transformants or knockouts that exhibited a morphological difference relative to wild-type control plants, i.e., a modified structure and/or development characteristics. For such studies, the transformants were observed by eye to identify novel structural or developmental characteristics associated with the ectopic expression of the polynucleotides or polypeptides of the invention. Examples of genes and equivalogs that confer significant improvements to overexpressing plants are noted below. Experimental observations made with regard to specific genes whose expression has been modified in overexpressing plants, and potential applications based on these observations, are also presented.

Modified phenotypes observed for particular overexpressor or knockout plants are provided in Table 7. For a particular overexpressor that shows a less beneficial characteristic, it may be more useful to select a plant with a decreased expression of the particular transcription factor. For a particular knockout that shows a less beneficial characteristic, it may be more useful to select a plant with an increased expression of the particular transcription factor.

Table 7 provides exemplary polynucleotide and polypeptide sequences of the invention. The sequences of the Sequence Listing or those in Tables 7-11, or those disclosed here, can be used to prepare transgenic plants and plants with altered traits. The specific transgenic plants listed below are produced from the sequences of the Sequence Listing as noted. A number of genes and homologs that confer significant improvements to knockout or overexpressing plants are noted below. Experimental observations made with regard to specific genes whose expression was modified in overexpressing or knockout plants, and potential applications based on these observations, are also presented. As noted in the results of the plate-based physiology assays presented in the tables of this Example, a representative number of sequences from diverse plant species conferred various levels of increased stress tolerance in a range of abiotic stress assays. Observed effects of overexpression on flowering time are also noted in the text below. These comparable effects indicate that sequences found within specific clades or subclades are functionally related and can be used to confer various types of abiotic stress tolerance in plants. A number of these genes concurrently confer tolerance to multiple abiotic stresses.

Salt stress assays are intended to find genes that confer better germination, seedling vigor or growth in high salt. Evaporation from the soil surface causes upward water movement and salt accumulation in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt concentration in the whole soil profile. Plants differ in their tolerance to NaCl depending on their stage of development; therefore seed germination, seedling vigor, and plant growth responses are evaluated.

Osmotic stress assays (including NaCl and mannitol assays) are intended to determine if an osmotic stress phenotype is NaCl-specific or if it is a general osmotic stress related phenotype. Plants tolerant to osmotic stress could also have more tolerance to drought and/or freezing.

Drought assays are intended to find genes that mediate better plant survival after short-term, severe water deprivation. Ion leakage will be measured if needed. Osmotic stress tolerance would also support a drought tolerant phenotype.

Temperature stress assays are intended to find genes that confer better germination, seedling vigor or plant growth under temperature stress (cold, freezing and heat).

Sugar sensing assays are intended to find genes involved in sugar sensing by germinating seeds on high concentrations of sucrose and glucose and looking for degrees of hypocotyl elongation. The germination assay on mannitol controls for responses related to osmotic stress. Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, flowering, senescence, sugar metabolism and photosynthesis. Sucrose is the major transport form of photosynthate and its flux through cells has been shown to affect gene expression and alter storage compound accumulation in seeds (source-sink relationships). Glucose-specific hexose-sensing has also been described in plants and is implicated in cell division and repression of “famine” genes (photosynthetic or glyoxylate cycles).

Germination assays followed modifications of the same basic protocol. Sterile seeds were sown on the conditional media listed below. Plates were incubated at 22° C. under 24-hour light (120-130 μEin/m²/s) in a growth chamber. Evaluation of germination and seedling vigor was conducted 3 to 15 days after planting. The basal media was 80% Murashige-Skoog medium (MS)+vitamins.

For salt and osmotic stress germination experiments, the medium was supplemented with 150 mM NaCl or 300 mM mannitol. Growth regulator sensitivity assays were performed in MS media, vitamins, and either 0.3 μM ABA, 9.4% sucrose, or 5% glucose.
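For orientation, the supplement concentrations above can be converted to grams per liter of medium. The sketch below uses standard molar masses for NaCl and mannitol and assumes that the sugar percentages are weight per volume, which the text does not state; the function names are hypothetical.

```python
# Molar masses in g/mol (standard values).
MOLAR_MASS = {"NaCl": 58.44, "mannitol": 182.17}

def grams_per_liter_from_mM(compound, millimolar):
    """Mass of compound per liter of medium for a millimolar concentration."""
    return MOLAR_MASS[compound] * millimolar / 1000.0

def grams_per_liter_from_percent(percent_w_v):
    """Assumes percent means grams per 100 ml (w/v); this is an assumption."""
    return percent_w_v * 10.0

print(round(grams_per_liter_from_mM("NaCl", 150), 2))       # salt plates
print(round(grams_per_liter_from_mM("mannitol", 300), 2))   # osmotic stress plates
print(round(grams_per_liter_from_percent(9.4), 1))          # sucrose plates
print(round(grams_per_liter_from_percent(5), 1))            # glucose plates
```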

Temperature stress cold germination experiments were carried out at 8° C. Heat stress germination experiments were conducted at 32° C. to 37° C. for 6 hours of exposure.

For stress experiments conducted with more mature plants, seeds were germinated and grown for seven days on MS+vitamins+1% sucrose at 22° C. and then transferred to chilling and heat stress conditions. The plants were either exposed to chilling stress (6 hour exposure to 4-8° C.), or heat stress (32° C. was applied for five days, after which the plants were transferred back to 22° C. for recovery and evaluated after 5 days relative to controls not exposed to the depressed or elevated temperature).

Soil-based drought screens were performed with Arabidopsis plants overexpressing the transcription factors listed in the Sequence Listing, where noted below. Seeds from wild-type Arabidopsis plants, or plants overexpressing a polypeptide of the invention, were stratified for three days at 4° C. in 0.1% agarose. Fourteen seeds of each overexpressor or wild-type were then sown in three inch clay pots containing a 50:50 mix of vermiculite:perlite topped with a small layer of MetroMix 200 and grown for fifteen days under 24 hr light. Pots containing wild-type and overexpressing seedlings were placed in flats in random order. Drought stress was initiated by placing pots on absorbent paper for seven to eight days. The seedlings were considered to be sufficiently stressed when the majority of the pots containing wild-type seedlings within a flat had become severely wilted. Pots were then re-watered and survival was scored four to seven days later. Plants were ranked against wild-type controls for each of two criteria: tolerance to the drought conditions and recovery (survival) following re-watering.

At the end of the initial drought period, each pot was assigned a numeric value score depending on the above criteria. A low value was assigned to plants with an extremely poor appearance (i.e., the plants were uniformly brown) and a high value given to plants that were rated very healthy in appearance (i.e., the plants were all green). After the plants were rewatered and incubated an additional four to seven days, the plants were reevaluated to indicate the degree of recovery from the water deprivation treatment.

An analysis was then conducted to determine which plants best survived water deprivation, identifying the transgenes that consistently conferred drought-tolerant phenotypes and their ability to recover from this treatment. The analysis was performed by comparing overall and within-flat tabulations with a set of statistical models to account for variations between batches. Several measures of survival were tabulated, including: (a) the average proportion of plants surviving relative to wild-type survival within the same flat; (b) the median proportion surviving relative to wild-type survival within the same flat; (c) the overall average survival (taken over all batches, flats, and pots); (d) the overall average survival relative to the overall wild-type survival; and (e) the average visual score of plant health before rewatering.
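Measures (a) through (d) above can be tabulated from per-pot survival counts. The sketch below is an illustration under stated assumptions: each pot is recorded as (flat, genotype, plants surviving, seeds sown), overall survival is computed as a pooled proportion, and the record layout, names and counts are hypothetical. Measure (e), the visual score, and the statistical models for batch variation are not covered.

```python
from statistics import mean, median

def survival_summary(pots, line, wild_type="wt"):
    """pots: list of (flat, genotype, n_survived, n_sown) tuples."""
    def rate(rows):
        return sum(r[2] for r in rows) / sum(r[3] for r in rows)

    relative = []
    for flat in sorted({p[0] for p in pots}):
        line_rows = [p for p in pots if p[0] == flat and p[1] == line]
        wt_rows = [p for p in pots if p[0] == flat and p[1] == wild_type]
        # Only flats with both genotypes and surviving wild-type plants are compared.
        if line_rows and wt_rows and rate(wt_rows) > 0:
            relative.append(rate(line_rows) / rate(wt_rows))

    overall_line = rate([p for p in pots if p[1] == line])
    overall_wt = rate([p for p in pots if p[1] == wild_type])
    return {
        "a_mean_relative_within_flat": mean(relative),
        "b_median_relative_within_flat": median(relative),
        "c_overall_survival": overall_line,
        "d_overall_relative_to_wt": overall_line / overall_wt,
    }

# Hypothetical counts: 14 seeds sown per pot, two flats.
pots = [(1, "line_X", 12, 14), (1, "wt", 6, 14),
        (2, "line_X", 7, 14), (2, "wt", 7, 14)]
summary = survival_summary(pots, "line_X")
print(summary)
```

Comparing within flats first, as in measures (a) and (b), keeps a flat that dried unusually fast or slowly from dominating the result.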

Flowering time was measured by the number of rosette leaves present when a visible inflorescence of approximately 3 cm was apparent. Rosette and total leaf number on the progeny stem are tightly correlated with the timing of flowering (Koornneef et al. (1991) Mol. Gen. Genet. 229: 57-66). The vernalization response was also measured. For vernalization treatments, seeds were sown to MS agar plates, sealed with micropore tape, and placed in a 4° C. cold room with low light levels for 6-8 weeks. The plates were then transferred to the growth rooms alongside plates containing freshly sown non-vernalized controls. Rosette leaves were counted when a visible inflorescence of approximately 3 cm was apparent.

Results:

Empty cells found in the tables in this Example indicate that results have not yet been obtained for that particular stress experiment. Abbreviations used in the tables in this example include:

-   (++) Substantially enhanced performance compared to controls. The phenotype was very consistent and growth was much greater than the normal levels of variability observed for that assay.
-   (+) Enhanced performance compared to controls. The response was consistently above the normal levels of variability observed for that assay.
-   (wt) response similar to wild-type; and
-   (germ.) germination.

G9 (SEQ ID NO: 1949 and 1950)

G9 is a putative paralog of G867, and has been referenced in the public literature as RAP2.8 and RAV2 (Okamuro et al. (1997) Proc. Natl. Acad. Sci. USA 94: 7076-7081; Kagaya et al. (1999) Nucleic Acids Res. 27: 470-478). No genetic analysis of the locus has yet been published.

Experimental Observations

We have previously observed that G9 overexpression enhanced root growth. The aim of the present study was to re-assess 35S::G9 lines and determine whether overexpression of the gene could confer enhanced stress tolerance in a comparable manner to G867. We also sought to test whether use of a two-component overexpression system would produce any strengthening of the phenotype relative to the use of a 35S direct promoter-fusion.

New overexpression lines have been obtained using both a direct promoter fusion construct and a two component expression system. Lines generated by either of these methods exhibited similar phenotypes and displayed a number of morphological effects that had not been observed during our earlier genomics screens. These included a reduction in overall size, alterations in leaf orientation (which potentially indicated a disruption in circadian control), slow growth, and floral abnormalities relative to controls.

We have tested the 35S::G9 two-component and direct-fusion lines under a variety of plate-based treatments. All of the direct fusion lines and most of the two-component lines out-performed controls in a germination assay on sodium chloride plates. In addition, many of these lines also showed positive phenotypes when germinated on sucrose and ABA, as well as in a growth assay under cold conditions.

It should be emphasized that we have obtained comparable developmental effects as well as a strong enhancement of drought related stress tolerance in 35S::G867 lines and in overexpression lines for the other two putative paralogs, G1930 and G993. The almost identical phenotypic effects observed for the four genes strongly suggest that they are functionally equivalent.

TABLE 15 G9 35S, Direct promoter-fusion and 2-components-supTfn

Line  Transformation Germ. in   Germ. in high  Sucrose  ABA  Germ in  Germ in  Growth   Drought  Chilling
                     high NaCl  mannitol                     heat     cold     in heat
302   Direct-fusion  +          wt             +        wt   wt       wt       wt       wt       +
304   Direct-fusion  +          wt             wt       wt   wt       wt       wt       wt       wt
305   Direct-fusion  ++         wt             +        wt   wt       wt       wt       wt       +
306   Direct-fusion  +          wt             wt       wt   wt       wt       wt       wt       wt
307   Direct-fusion  +          wt             wt       wt   wt       wt       wt       wt       wt
310   Direct-fusion  +          wt             +        ++   wt       wt       wt       wt       +
311   Direct-fusion  ++         wt             +        wt   wt       wt       wt       wt       +
312   Direct-fusion  ++         wt             wt       +    wt       wt       wt       wt       +
313   Direct-fusion  +          wt             +        +    wt       wt       wt       wt       +
318   Direct-fusion  ++         wt             +        wt   wt       wt       wt       wt       wt
483   2-comp         wt         wt             +        +    wt
485   2-comp         +          wt             wt       wt   wt
486   2-comp         wt         wt             wt       wt   wt
488   2-comp         +          wt             +        ++   wt
489   2-comp         +          wt             wt       ++   wt
490   2-comp         wt         wt             +        ++   wt
491   2-comp         wt         wt             +        ++   wt
493   2-comp         wt         wt             +        ++   wt
494   2-comp         +          wt             wt       ++   wt
498   2-comp         +          wt             +        ++   wt

Potential Applications

Based on the results of these abiotic stress assays, G867 and related genes such as G9 are excellent candidate genes for improvement of drought and cold-related stress tolerance in commercial species. The morphological effects associated with their overexpression suggest that tissue-specific or conditional promoters might be used to optimize the utility of these genes.

G19 (SEQ ID NO: 3 and 5)

Published Information

G19 belongs to the EREBP subfamily of transcription factors and contains only one AP2 domain. G19 corresponds to the previously described gene RAP2.3 (Okamuro et al. (1997) Proc. Natl. Acad. Sci. 94:7076-7081). Close inspection of the Arabidopsis cDNA sequences of RAP2.3 (AF003096; Okamuro et al. (1997) supra), AtEBP (Y09942; Buttner et al. (1997) Proc. Natl. Acad. Sci. 94:5961-5966), and ATCADINP (Z37504) suggests that they may correspond to the same gene (Riechmann et al. (1998) Biol. Chem. 379:633-646). G19/RAP2.3 is ubiquitously expressed (Okamuro et al. (1997) supra). AtEBP was isolated by virtue of the protein-protein interaction between AtEBP and OBF4, a basic-region leucine zipper transcription factor (Buttner et al. (1997) supra). AtEBP expression levels in seedlings were increased after treatment with ethylene (ethephon) (Buttner et al. (1997) supra). AtEBP was found to bind to GCC-box containing sequences, like that of the PRB-1b promoter (Buttner et al. (1997) supra). It has been suggested that the interaction between AtEBP and OBF4 reflects cross-coupling between EREBP and bZIP transcription factors that might be important in regulating gene expression during the plant defense response (Buttner et al. (1997) supra).

Experimental Observations

Transgenic plants in which G19 is expressed under the control of the 35S promoter were morphologically similar to control plants. G19 is constitutively expressed in the different tissues examined; however, G19 expression was significantly repressed by methyl jasmonate (MeJ) and induced by ACC (this latter result correlates with the previously described increase in G19 expression levels in seedlings after treatment with ethylene (ethephon); Buttner et al. (1997) supra). G19 was significantly induced upon infection by the fungal pathogen Erysiphe orontii. In addition, G19 overexpressing plants were more tolerant to infection with a moderate dose of Erysiphe orontii. G19 overexpressing plants were also tested for their tolerance to two other pathogens, the necrotrophic fungal pathogen Fusarium oxysporum and the bacterial pathogen Pseudomonas syringae; the transgenic plants were not found to have altered susceptibility to the pathogens.

Both the jasmonic acid and the ethylene signal transduction pathways are involved in the regulation of the defense response and the wound response, and the two pathways have been found to interact synergistically. The regulation of G19 expression by both hormones, its induction upon Erysiphe orontii infection, as well as the preliminary data indicating that increased tolerance to that pathogen was conferred by G19 overexpression, suggest that G19 plays a role in the control of the defense and/or wound response. It would be of interest to test G19 overexpressing plants in insect-plant interaction experiments. The increase in tolerance to Erysiphe orontii that is conferred by G19 overexpression can be tested using other races of the pathogen. It would also be of interest to test other pathogens in addition to Erysiphe orontii, Fusarium oxysporum, and Pseudomonas syringae.

Since G19 was expressed at significant levels in a constitutive fashion, similar experiments to those described here can be performed with G19 knockout mutant plants to further elucidate the function of this gene.

Potential Applications

G19 or its equivalogs can be used to manipulate the plant defense, wound, or insect response, as well as the jasmonic acid and ethylene signal transduction pathways themselves.

G22 (SEQ ID NO: 5 and 6)

Published Information

G22 was identified in the sequence of BAC T13E15 (gene T13E15.5) by The Institute for Genomic Research (TIGR) as a “TINY transcription factor isolog”. G22 belongs to the EREBP subfamily and contains only one AP2 domain, and phylogenetic analyses place G22 relatively close to other EREBP subfamily genes, such as TINY and ATDL4400C (Riechmann et al. (1998) Biol. Chem. 379:633-646).

Experimental Observations

G22 was constitutively expressed at medium levels. There appeared to be no alteration in plant morphology upon G22 overexpression. Plants ectopically overexpressing G22 were more tolerant to high NaCl containing media in a root growth assay compared with wild-type controls.

Potential Applications

G22 or its equivalogs can be used to increase plant tolerance to soil salinity during germination, at the seedling stage, or throughout the plant life cycle. Salt tolerance is a particularly desirable phenotype during the germination stage of a crop plant, which would impact survivability and yield.

G28 (SEQ ID NO: 9 and 10)

Published Information

G28 corresponds to AtERF1 (GenBank accession number AB008103) (Fujimoto et al. (2000) Plant Cell 12:393-404). G28 appears as gene AT4g17500 in the annotated sequence of Arabidopsis chromosome 4 (AL161546.2).

AtERF1 has been shown to have GCC-box binding activity [some defense-related genes that were induced by ethylene were found to contain a short cis-acting element known as the GCC-box: AGCCGCC (Ohme et al. (1990) Plant Mol. Biol. 15:941-946)]. Using transient assays in Arabidopsis leaves, AtERF1 was found to be able to act as a GCC-box sequence specific transactivator (Fujimoto et al. (2000) supra).

AtERF1 expression has been described to be induced by ethylene (two- to three-fold increase in AtERF1 transcript levels 12 h after ethylene treatment) (Fujimoto et al. (2000) supra). In the ein2 mutant, the expression of AtERF1 was not induced by ethylene, suggesting that the ethylene induction of AtERF1 is regulated under the ethylene signaling pathway (Fujimoto et al. (2000) supra). AtERF1 expression was also induced by wounding, but not by other abiotic stresses (such as cold, salinity, or drought) (Fujimoto et al. (2000) supra).

It has been suggested that AtERFs, in general, may act as transcription factors for stress-responsive genes, and that the GCC-box may act as a cis-regulatory element for biotic and abiotic stress signal transduction in addition to its role as an ethylene responsive element (ERE) (Fujimoto et al. (2000) supra), but there is no data available on the physiological functions of AtERF1.

Experimental Observations

The function of G28 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G28 overexpressing lines were more tolerant to infection with a moderate dose of the fungal pathogen Erysiphe orontii. G28 overexpression did not seem to have detrimental effects on plant growth or vigor, since plants from most of the lines were morphologically wild-type. In addition, no difference was detected between those lines and the corresponding wild-type controls in all the biochemical assays that were performed.

G28 was ubiquitously expressed. G28 overexpressing lines were also more tolerant to Sclerotinia sclerotiorum and Botrytis cinerea. In a repeat experiment using individual lines, all three lines analyzed showed tolerance to S. sclerotiorum, and two of the three lines tested were more tolerant to B. cinerea.

Potential Applications

G28 transgenic plants had an altered response to fungal pathogens, in that those plants were more tolerant to the pathogens. Therefore, G28 or its equivalogs can be used to manipulate the defense response in order to generate pathogen-resistant plants.

G47 (SEQ ID NO: 11 and 12)

Published Information

G47 corresponds to gene T22J18.2 (AAC25505).

Experimental Observations

G47 expression levels can be altered by environmental conditions, in particular reduced by salt and osmotic stresses.

The function of G47 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. 35S::G47 plants showed enhanced tolerance to osmotic stress. In a root growth assay on PEG containing media, G47 overexpressing transgenic seedlings were larger and had more root growth compared with the wild-type controls. G47 overexpressors also were significantly more drought tolerant than wild-type control plants in a soil-based assay.

Overexpression of G47 also produced a substantial delay in flowering time and caused a marked change in shoot architecture. 35S::G47 transformants were small at early stages and switched to flowering more than a week later than wild-type controls (continuous light conditions). The inflorescences from these plants appeared thick and fleshy, had reduced apical dominance, and exhibited reduced internode elongation leading to a short compact stature. The branching pattern of the stems also appeared abnormal, with the primary shoot becoming “kinked” at each coflorescence node. Additionally, the plants showed reduced fertility and formed rather small siliques that were borne on short pedicels and held vertically, close against the stem.

Additional alterations were detected in the inflorescence stems of 35S::G47 plants. Stem sections from T2-21 and T2-24 plants were of wider diameter, and had large irregular vascular bundles containing a much greater number of xylem vessels than wild type. Furthermore, some of the xylem vessels within the bundles appeared narrow and were possibly more lignified than were those of controls.

G47 was expressed at higher levels in rosette leaves, and transcripts were detected in other tissues (flower, embryo, silique, and germinating seedling), but not in roots.

Potential Applications

G47 or its equivalogs can be used to manipulate flowering time, to modify plant architecture and stem structure (including development of vascular tissues and lignin content) and to improve plant performance under osmotic stress and drought conditions.

Transcription factor equivalogs that modulate lignin content can be valuable. This modulation can allow the quality of wood used for furniture or construction to be improved. Lignin is energy rich; increasing lignin composition is valuable in raising the energy content of wood used for fuel. Conversely, the pulp and paper industries seek wood with a reduced lignin content. Currently, lignin must be removed in a costly process that involves the use of many polluting chemicals. Consequently, lignin is a serious barrier to efficient pulp and paper production. In addition to forest biotechnology applications, changing lignin content can increase the palatability of various fruits and vegetables.

G226 (SEQ ID NO: 37 and 38)

Published Information

G226 was identified from the Arabidopsis BAC sequence, AC002338, based on its sequence similarity within the conserved domain to other Myb family members in Arabidopsis. To date, there is no published information regarding the function of this gene.

Experimental Observations

The function of G226 was analyzed through its ectopic overexpression in plants. G226 overexpressors were more tolerant to low nitrogen and high salt stress. They showed more root growth and possibly more root hairs under conditions of nitrogen limitation compared with wild-type controls. Many G226 overexpressors were glabrous and produced little or no anthocyanin when under stress, such as growth under conditions of low nitrogen or high salt; these effects might be due to binding site competition with other Myb family transcription factors involved in these functions and not directly related to the primary function of this gene.

Results from the biochemical analysis of G226 overexpressors suggested that one line had higher amounts of seed protein, which could have been a result of increased nitrogen uptake by these plants.

A microarray experiment was done on a separate G226 overexpressing line. The G226 sequence itself was overexpressed 16-fold above wild type; however, very few changes in other gene expression were observed in this line. On the array, a chlorate/nitrate transporter DNA sequence was induced 2.7-fold over wild type, which could explain the low nitrogen tolerant phenotype of the plants and the increased amounts of seed protein in one of the lines. The same DNA sequence was present several times on the array and in all cases the DNA sequence showed induction, adding more validity to the data. Five other genes/DNA sequences were induced but had unknown function. Sequences encoding a methyltransferase, a pollen-specific protein, and a zinc binding peroxisomal membrane protein were also induced; however, their role in regard to the phenotype of the plants is not known.

TABLE 16 G226 35S, 2-components supTfn

Line  Germ. in   Germ. in high  Sucrose  ABA  Germ in  Germ in  Growth   Drought  Chilling  G682-like
      high NaCl  mannitol                     heat     cold     in heat                     root morph.
308   wt         wt             ++       +    wt       wt       wt       wt       wt        +
309   wt         wt             wt       +    wt       +        wt       wt       +         +
313   wt         wt             ++       ++   wt       +        wt       wt       +         +
316   wt         wt             ++       ++   wt       wt       wt       wt       wt        +
318   wt         wt             wt       ++   wt       wt       wt       wt       wt        +
381   wt         wt             +        wt   wt       wt       wt       wt       wt        +
383   wt         wt             +        +    wt       wt       wt       wt       wt        +
583   wt         wt             wt       +    wt       wt       wt       wt       +
585   wt         wt             wt       +    wt       wt       wt       wt       wt        +

Potential Applications

The results of these abiotic stress tolerance assays indicate that G682 and related genes such as G226 may be used to improve drought and cold-related stress tolerance in plants.

The utilities of a gene or its equivalogs conferring tolerance to conditions of low nitrogen include: (1) Cost savings to the farmer by reducing the amounts of fertilizer needed; (2) Environmental benefits of reduced fertilizer runoff; (3) Improved yield and stress tolerance. In addition, G226 can be used to increase seed protein amounts and/or composition, which may impact yield as well as the nutritional value and production of various food products.

G682 and related genes such as G226 can be used to alter trichome number and distribution in plants. Trichome glands on the surface of many higher plants produce and secrete exudates, which give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or they may be allergens or irritants to protect against herbivores. It has also been suggested that trichomes may decrease transpiration by decreasing leaf surface airflow, and by exuding chemicals that protect the leaf from the sun.

The reduction in size that was apparent in these lines suggests that the utility of G226 might be optimized by use of different promoters or protein modifications.

G353 (SEQ ID NO: 59 and 60)

Published Information

G353 was identified in the sequence of P1 clone MMN10, GenBank accession number AB0154751, released by the Arabidopsis Genome Initiative. G353 corresponds to RHL41 (Kazuoka et al. (2000) Plant J. 24:191-203) and Zat12 (Meissner et al. (1997) Plant Mol. Biol. 33:615-624). Transgenic Arabidopsis plants over-expressing the RHL41 gene showed an increased tolerance to high-intensity light, and also morphological changes of thicker and dark green leaves. The palisade parenchyma was highly developed in the leaves of the transgenic plants. Anthocyanin content, as well as the chlorophyll content, also increased. Antisense transgenic plants exhibited decreased tolerance to high irradiation. RHL41 protein may play a key role in the acclimatization response to changes in light intensity.

Experimental Observations

G353 was uniformly expressed in all tissues and under all conditions tested in RT-PCR experiments. The highest level of expression was observed in rosette leaves, embryos, and siliques.

The function of this gene was analyzed using transgenic plants in which G353 was expressed under the control of the 35S promoter. Overexpression of G353 resulted in enhanced tolerance to osmotic stress and enhanced drought tolerance in a soil-based assay.

Overexpression also affected flower morphology to a significant degree. 35S::G353 plants had a reduction in flower pedicel length, and downward pointing siliques. This phenotype was very similar to that described for the brevipedicellus (bp) mutant (Koornneef et al. (1983) J. Hered. 74:265-272) and in overexpression of a related gene, G354. Other morphological changes in shoots were also observed in 35S::G353 plants. Leaves had short petioles, were rather flat, rounded, and sometimes showed changes in coloration. These effects were observed in varying degrees in the majority of transformants. Severely affected plants were tiny, had contorted leaves, poor fertility, and produced few seeds. Overexpression of G353 in Arabidopsis resulted in an increase in seed glucosinolate M39494 in two T2 lines.

Potential Applications

G353 or its equivalogs can be used to alter inflorescence structure, which may have value in production of novel ornamental plants.

G353 or its equivalogs can be used to alter a plant's response to water deficit conditions and, therefore, be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing.

Increases or decreases in specific glucosinolates or total glucosinolate content may be desirable depending upon the particular application. For example: (1) Glucosinolates are undesirable components of the oilseeds used in animal feed, since they produce toxic effects. Low-glucosinolate varieties of canola have been developed to combat this problem; (2) Some glucosinolates have anti-cancer activity; thus, increasing the levels or composition of these compounds might be of interest from a nutraceutical standpoint; (3) Glucosinolates form part of a plant's natural defense against insects; modification of glucosinolate composition or quantity could therefore afford increased protection from predators; furthermore, in edible crops, tissue specific promoters might be used to ensure that these compounds accumulate specifically in tissues, such as the epidermis, which are not taken for consumption.

G354 (SEQ ID NO: 61 and 62)

Published Information

G354 was identified in the sequence of BAC clone F12M12, GenBank accession number AL355775, released by the Arabidopsis Genome Initiative. G354 corresponds to ZAT7 (Meissner et al. (1997) Plant Mol. Biol. 33:615-624).

Experimental Observations

Greatest levels of expression of G354 were observed in rosette leaves, embryos, and siliques. Some expression of G354 was also observed in flowers.

The function of this gene was analyzed using transgenic plants in which G354 was overexpressed under the control of the 35S promoter. 35S::G354 plants had a reduction in flower pedicel length, and downward pointing siliques. This phenotype was very similar to that described for the brevipedicellus (bp) mutant (Koornneef et al. (1983) J. Hered. 74:265-272) and in overexpression of a related gene, G353. Other morphological changes in shoots were also observed in 35S::G354 plants. Many 35S::G354 seedlings had abnormal cotyledons, elongated, thickened hypocotyls, and short roots. The majority of T1 plants had a very extreme phenotype, were tiny, and arrested development without forming inflorescences. T1 plants showing more moderate effects had poor seed yield.

Overexpression of G354 in Arabidopsis resulted in seedlings with an altered response to light. In darkness, G354 seedlings failed to etiolate. The phenotype was most severe in seedlings from one line where overexpression of the transgene resulted in reduced open and greenish cotyledons.

Potential Applications

G354 or its equivalogs can be used to alter inflorescence structure, which may have value in production of novel ornamental plants.

G354 modifies the light response and thus G354 or its equivalogs may be useful for modifying plant growth or development, for example, photomorphogenesis in poor light, or accelerating flowering time in response to various light intensities, quality or duration to which a non-transformed plant would not similarly respond. Elimination of shading responses may lead to increased planting densities with subsequent yield enhancement.

G481 (SEQ ID NO: 87 and 88)

Published Information

G481 is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family, and is equivalent to AtHAP3a which was identified by Edwards et al. ((1998) Plant Physiol. 117:1015-1022) as an EST with extensive sequence homology to the yeast HAP3. Northern blot data from five different tissue samples indicated that G481 was primarily expressed in flower and/or silique, and root tissue.

Experimental Observations

We have now generated additional sets of 35S lines for G481 using the two component system. Two batches of 2-component 35S lines were obtained (lines 301-320 and 741-751). The majority of plants from these T1 sets displayed no consistent difference in morphology to wild-type controls. However, a number of individuals were observed to be late flowering (#305, 310, 315, 744, 748) under the conditions of continuous light.

To confirm the above phenotype, a selection of T2 lines was examined under 24-hour light conditions; late flowering was also observed in that generation in a significant number of these plants, as detailed below:

T2-303: 6/6 plants were slightly late flowering (approximately 2-3 days after controls).

T2-307: 4/6 plants were slightly late flowering (approximately 2-3 days after controls), 2/6 appeared wild type.

T2-309: 4/6 plants were slightly late flowering (approximately 2-3 days after controls), 2/6 appeared wild type.

T2-310: 6/6 plants were moderately late flowering (1-2 weeks after controls).

T2-312: 6/6 plants were moderately late flowering (approximately 1 week after controls).

T2-741: 6/6 plants appeared wild-type.

T2-742: 6/6 plants appeared wild-type.

T2-744: 6/6 plants appeared wild-type.

In addition to analyzing these two component lines, we also re-examined some of the 35S::G481 lines that we had generated during our genomics screens. 35S::G481 line 3 was back-crossed to wild-type and F1 plants were examined. 18/18 of these F1 plants showed a moderate delay in the onset of flowering by about 1-2 weeks.

Of the ten two-component lines submitted for physiological assays, the following showed a segregation on selection plates in the T2 generation that was compatible with the transgene being present at a single locus: 304, 309, 312, 317, 741, 748. Lines 305, 315, 318, 742 showed segregations that were compatible with insertions at multiple loci.
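As a non-limiting illustration, compatibility of a T2 segregation ratio with a single insertion locus can be assessed with a chi-square goodness-of-fit test. The sketch below, in Python, assumes a dominant selectable marker, so that a single locus is expected to give a 3:1 ratio of resistant to sensitive seedlings and two unlinked loci a 15:1 ratio. These ratios, the seedling counts, and the function names are assumptions made for illustration; the test actually applied to the lines above is not specified.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_1DF_005 = 3.841

# Hypothetical Mendelian models for a dominant selectable marker
# (resistant : sensitive seedlings on selection plates).
MODELS = {
    "single locus (3:1)": (3, 1),
    "two unlinked loci (15:1)": (15, 1),
}


def chi_square(observed, ratio):
    """Goodness-of-fit statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def compatible_models(resistant, sensitive):
    """Return the models that the observed counts do not reject at alpha = 0.05."""
    return [
        name
        for name, ratio in MODELS.items()
        if chi_square((resistant, sensitive), ratio) < CRITICAL_1DF_005
    ]


if __name__ == "__main__":
    # Illustrative counts only; not data from the lines described above.
    print(compatible_models(74, 26))
    print(compatible_models(95, 5))
```

With small seedling counts more than one model may remain compatible, which is why the text speaks of segregation being "compatible with" a given number of loci rather than proving it.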

In plate-based physiology assays, tolerance was seen to drought related stress: four of ten two-component lines tested were less sensitive in the ABA germination assay, and two of these lines also displayed cold tolerance in a germination assay.

G481 overexpressing plants were found to be more tolerant to drought in a soil-based assay.

TABLE 17 G481 35S, 2-components-supertransformation (supTfn)

Line  Germ. in   Germ. in high  Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  mannitol                     heat      cold      in heat
304   wt         wt             wt       +    wt        wt        wt       wt       wt
305   wt         wt             wt       wt   wt        wt        wt       wt       wt
309   wt         wt             wt       wt   wt        wt        wt       wt       wt
312   wt         wt             wt       +    wt        wt        wt       wt       wt
315   wt         wt             wt       +    wt        +         wt       wt       wt
317   wt         wt             wt       wt   wt        wt        wt       wt       wt
318   wt         wt             wt       +    wt        +         wt       wt       wt
741   wt         wt             wt       wt   wt        wt        wt       wt       wt
744   wt         wt             wt       wt   wt        wt        wt       wt       wt
748   wt         wt             wt       wt   wt        wt        wt       wt       wt

Potential Applications

The results of these latest overexpression studies confirm our earlier conclusion that G481 and its equivalogs are excellent candidates for improvement of drought related stress tolerance in commercial species. Additionally, G481 related genes could also be used to manipulate flowering time.

G482 (SEQ ID NO: 89 and 90)

Published Information

G482 is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family, and is equivalent to AtHAP3b which was identified by Edwards et al. ((1998) Plant Physiol. 117:1015-1022) as an EST with homology to the yeast gene HAP3b. Edwards' northern blot data suggest that AtHAP3b is expressed primarily in roots. No other functional information regarding G482 is publicly available.

Experimental Observations

RT-PCR analysis of endogenous levels of G482 transcripts indicated that this gene was expressed constitutively in all tissues tested. A cDNA array experiment supported the RT-PCR derived tissue distribution data. G482 was not induced above basal levels in response to any environmental stress treatments tested.

We have now generated 35S lines for G482 using the two component system; two batches of T1 lines (321-341 and 341-360) were examined and many of the plants showed a striking acceleration of flowering (1-2 weeks sooner than wild-type) under 24 hour light conditions.

The early flowering effect was seen in 10/20 lines from the 321-341 set (#323, 325, 326, 327, 329, 330, 332, 333, 335, 336) and 7/20 lines from the 341-360 set (#341, 351, 355, 356, 357, 358, 360). Comparable effects on flowering time were also seen in each of three T2 populations (329, 330, 333) that were morphologically examined. In addition to the accelerated flowering, the majority of 35S::G482 lines also displayed a slight reduction in overall size; in fact a number of lines were very small (#321, 328, 331, 338, 339, 340 and #342, 344, 345, 349, 350) and did not survive to maturity.

All of the two-component lines showed segregation on selection plates in the T2 generation that was compatible with the transgene being present at a single locus.

G482 function was analyzed through its ectopic overexpression in plants under the control of a 35S promoter using the two component system.

In plate-based physiology assays, half of the two-component lines tested showed increased vigor on mannitol media and three lines performed better than wild-type in the heat germination assay.

It should be emphasized that we have observed stress-related tolerance phenotypes for several other G481 related genes including G485 and G1820. The similar effects seen when these genes are overexpressed strongly suggest a functional relationship between them.

TABLE 18 G482 35S, 2-components-supTfn

Line  Germ. in   Germ. in high  Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  mannitol                     heat      cold      in heat
324   wt         wt             wt       wt   wt        wt        wt       wt       wt
327   wt         wt             wt       wt   wt        wt        wt       wt       wt
330   wt         wt             wt       wt   wt        wt        wt       wt       wt
333   wt         wt             wt       wt   wt        wt        wt       wt       wt
343   wt         wt             wt       wt   wt        wt        wt       wt       wt
346   wt         +              wt       wt   wt        wt        +        wt       wt
347   wt         +              wt       wt   wt        wt        wt       wt       wt
351   wt         +              wt       wt   wt        wt        +        wt       wt
353   wt         +              wt       wt   wt        wt        wt       wt       wt
354   wt         +              wt       wt   wt        wt        +        wt       wt

Potential Applications

The results of this study bolster our conclusion that G481 and the related genes, including G482, are excellent candidates for improvement of drought related stress tolerance in commercial species.

Additionally, G482 could be useful for manipulating flowering time.

Arabidopsis G485 (SEQ ID NO: 2009 and 2010)

G485 is a paralog of G481, and is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. It has been referenced as sequence 1042 from Patent Application WO0216655 on stress-regulated genes, transgenic plants and methods of use. G485 was reported therein to be cold responsive in a microarray analysis (Harper et al. (2002) Patent Application WO 0216655-A 1042 28-Feb.-2002). No other functional information regarding G485 is publicly available.

During our earlier genomics program, we determined that plants overexpressing G485 had accelerated flowering, bolting up to 1 week earlier than wild-type or non-transformed plants grown under 24 hr lights. These studies, combined with related studies on plants lacking G485 expression, have implicated G485 as both sufficient to act as a floral activator, and also necessary in that role within the plant.

Experimental Observations

The aim of this study was to re-assess the effects of overexpression ofG485 using a two-component system and determine if this gene can conferenhanced stress tolerance in a manner comparable to G481.

We have now generated 35S lines for G485 using the two component system.These plants showed a comparable early flowering phenotype to thatobserved for 35S direct promoter fusion lines in our earlier genomicsprogram.

Many of the 35S::G485 two-component lines exhibited a markedacceleration in the onset of flowering and generally formed flower buds1-2 weeks sooner than wild type under continuous light conditions. Manyof the lines also showed a reduction in rosette biomass compared to wildtype. In fact, three of twenty lines (308, 312, 320) showed a severedwarf phenotype and did not survive to maturity. Early flowering wasexhibited by 11/20 of the T1 lines (#301, 302, 303, 304, 306, 307, 309,313, 315, 317, 319). The remaining lines appeared wild-type, apart fromlines 310 and 314, which were noted to be slightly delayed in the onsetof flowering. Line 14 was also infertile and failed to yield seed.Flowering time was also assessed in a number of T2 populations: plantsfrom the T2-302, T2-305, T2-307, and T2-309 all displayed earlyflowering comparable to that seen in the parental lines. Plants from theT2-310 and T2-311 populations flowered at the same time as controls.

All ten of the two-component lines submitted for physiological assays showed segregation on selection plates in the T2 generation that was compatible with the transgene being present at a single locus.
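One common way to judge whether T2 segregation is compatible with a single locus is a chi-square goodness-of-fit test against the 3:1 resistant-to-sensitive ratio expected for a single dominant selectable marker. The sketch below is illustrative only: the text does not say which test was applied, and the seedling counts and function names are hypothetical.

```python
# Minimal sketch: chi-square goodness-of-fit test of T2 segregation against 3:1.
# The counts used below are hypothetical; the source reports no seedling counts.

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value for 1 degree of freedom, alpha = 0.05


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Return the chi-square statistic for observed counts versus a 3:1 expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("at least one seedling must be scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_locus(resistant: int, sensitive: int) -> bool:
    """True when the 3:1 hypothesis is not rejected at the 5% level."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_5PCT_DF1


if __name__ == "__main__":
    # Hypothetical plate: 74 resistant and 26 sensitive seedlings.
    print(chi_square_3_to_1(74, 26), consistent_with_single_locus(74, 26))
    # Hypothetical plate with far too few sensitive seedlings for one locus.
    print(chi_square_3_to_1(95, 5), consistent_with_single_locus(95, 5))
```

A statistic below the critical value means the data do not contradict single-locus inheritance; it does not prove it, and small plates have little power to distinguish 3:1 from, for example, 15:1.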

In plate-based physiology assays, 35S::G485 two-component lines were more tolerant to salt stress in a germination assay compared to wild-type or non-transformed seedlings. Several salt-tolerant lines were also less sensitive to sucrose, ABA, and cold stress in separate germination assays.
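The per-line calls reported in Table 19 can be tallied per assay to see how many of the ten lines scored better than wild type ("+") in each. The following is a small sketch that transcribes the table values; the variable and function names are illustrative and not part of the disclosure.

```python
# Minimal sketch: count "+" calls per assay for the G485 two-component lines (Table 19).
# Row values are transcribed from Table 19; names such as TABLE_19 are illustrative.

ASSAYS = [
    "Germ. in high NaCl", "Germ. in high mannitol", "Sucrose", "ABA",
    "Germ. in heat", "Germ. in cold", "Growth in heat", "Drought", "Chilling",
]

TABLE_19 = {
    302: "+ wt wt + wt wt wt wt wt",
    305: "+ wt wt + wt + wt wt wt",
    307: "+ wt wt wt wt wt wt wt wt",
    309: "wt wt wt wt wt wt wt wt wt",
    310: "+ wt wt wt wt + wt wt wt",
    311: "+ wt + wt wt + wt wt wt",
    316: "+ wt + + wt + wt wt wt",
    317: "+ wt wt wt wt wt wt wt wt",
    318: "wt wt wt + wt + wt wt wt",
    319: "+ wt + wt wt + wt wt wt",
}


def tally_positive_calls(table: dict) -> dict:
    """Return, for each assay, the number of lines scored '+' (better than wild type)."""
    counts = {assay: 0 for assay in ASSAYS}
    for line, calls in table.items():
        values = calls.split()
        if len(values) != len(ASSAYS):
            raise ValueError(f"line {line}: expected {len(ASSAYS)} calls, got {len(values)}")
        for assay, value in zip(ASSAYS, values):
            if value == "+":
                counts[assay] += 1
    return counts


if __name__ == "__main__":
    for assay, count in tally_positive_calls(TABLE_19).items():
        print(f"{assay}: {count} of {len(TABLE_19)} lines")
```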

TABLE 19
G485 35S, 2-components-supTfn

Line | Germ. in high NaCl | Germ. in high mannitol | Sucrose | ABA | Germ. in heat | Germ. in cold | Growth in heat | Drought | Chilling
302 | + | wt | wt | + | wt | wt | wt | wt | wt
305 | + | wt | wt | + | wt | + | wt | wt | wt
307 | + | wt | wt | wt | wt | wt | wt | wt | wt
309 | wt | wt | wt | wt | wt | wt | wt | wt | wt
310 | + | wt | wt | wt | wt | + | wt | wt | wt
311 | + | wt | + | wt | wt | + | wt | wt | wt
316 | + | wt | + | + | wt | + | wt | wt | wt
317 | + | wt | wt | wt | wt | wt | wt | wt | wt
318 | wt | wt | wt | + | wt | + | wt | wt | wt
319 | + | wt | + | wt | wt | + | wt | wt | wt

Potential Applications

The results of these abiotic stress assays indicate that G481 and related genes, including G485, are excellent candidates for improvement of drought-related stress tolerance in commercial species.

Additionally, G485 could be used to manipulate flowering time.

G489 (SEQ ID NO: 93 and 94)

Published Information

G489 was identified from a BAC sequence that showed high sequence homology to AtHAP5-like transcription factors in Arabidopsis. During our earlier genomics program, we observed that plants overexpressing G489 were tolerant of NaCl and mannitol in separate germination assays.

Morphologically, the plants were similar to wild-type or non-transformed plants.

Experimental Observations

Two sets of 35S::G489 lines (301-320 and 421-440) were generated. Lines 301-320 harbored the 35S direct promoter-fusion construct (P51). The second batch of plants (421-440) overexpressed G489 via the two-component system. Neither of the two sets of 35S::G489 plants showed any consistent differences in morphology from wild-type controls.

The function of G489 was analyzed through its ectopic overexpression in plants. G489 overexpressors were more tolerant to high NaCl stress, showing more root growth and leaf expansion compared with the controls in culture. Two well-characterized ways in which NaCl toxicity is manifested in the plant are general osmotic stress and potassium deficiency due to the inhibition of potassium transport. These G489 overexpressor lines were more tolerant to osmotic stress in general, showing more root growth on mannitol-containing media. G489 overexpressors were also more tolerant to drought than wild-type control plants in soil-based assays.

RT-PCR analysis of endogenous levels of G489 transcripts indicated that this gene was expressed constitutively in all tissues tested. A cDNA array experiment confirmed the RT-PCR derived tissue distribution data. G489 was not induced above basal levels in response to the stress treatments tested.

Several 35S::G489 lines derived from direct promoter fusions were created and characterized morphologically. These plants showed no consistent differences in morphology from wild-type controls.

As shown in Table 20, five of ten G489-overexpressing lines tested were better able to germinate in cold conditions. Three lines of more mature plants were also better able to tolerate cold conditions.

TABLE 20
G489 35S, Direct promoter-fusion and 2-components-supTfn

Line | Transformation | Germ. in high NaCl | Germ. in high mannitol | Sucrose | ABA | Germ. in heat | Germ. in cold | Growth in heat | Drought | Chilling
301 | Direct fusion | wt | wt | wt | wt | wt | wt | wt | wt | +
302 | Direct fusion | wt | wt | wt | wt | wt | + | wt | wt | wt
303 | Direct fusion | wt | wt | wt | wt | wt | + | wt | wt | wt
304 | Direct fusion | wt | wt | wt | wt | wt | wt | wt | + | +
305 | Direct fusion | wt | wt | wt | wt | wt | + | wt | + | +
308 | Direct fusion | wt | wt | wt | wt | wt | + | wt | wt | wt
310 | Direct fusion | wt | wt | wt | wt | wt | + | wt | wt | wt
314 | Direct fusion | wt | wt | wt | wt | wt | wt | wt | wt | wt
316 | Direct fusion | wt | wt | wt | wt | wt | wt | wt | wt | wt
317 | Direct fusion | wt | wt | wt | wt | wt | wt | wt | wt | wt

The 2-comp lines have fewer than nine entries each; the entries are listed in their original order, without assignment to columns:

421 | 2-comp | wt wt wt wt wt wt +
422 | 2-comp | wt wt wt wt wt wt wt
423 | 2-comp | wt wt wt wt wt wt wt
424 | 2-comp | wt wt wt wt wt wt wt
425 | 2-comp | wt wt wt wt wt wt wt
426 | 2-comp | wt wt wt wt wt wt wt
430 | 2-comp | wt wt wt wt wt wt wt
434 | 2-comp | wt wt wt wt wt wt wt
437 | 2-comp | wt wt wt wt wt wt +
440 | 2-comp | wt wt wt wt wt + wt wt

Potential Applications

The results of this study bolster our conclusion that G481 and the related genes, including G489, are excellent candidates for improvement of drought-related stress tolerance in commercial species.

G634 (SEQ ID NO: 127 and 128)

Published Information

G634 was initially identified as public partial cDNA sequences for GTL1 and GTL2, which are splice variants of the same gene (Small et al. (1998) Proc. Natl. Acad. Sci. USA. 95:3318-3322). The published expression pattern of GTL1 shows that G634 is highly expressed in siliques and not expressed in leaves, stems, flowers or roots. There is no published information on the function of G634.

Closely Related Genes from Other Species

The closest non-Arabidopsis relative of G634 is the O. sativa gt-2 gene (EMBO J. (1992) 11:4131-4144), which is proposed to bind and regulate the phyA promoter. In addition, the pea DNA-binding protein DF1 (13786451) shows strong homology to G634. The homology of these proteins to G634 extends beyond the conserved domains, and thus these genes are likely to be orthologs of G634.

Experimental Observations

The boundaries of G634 were experimentally determined and the function of G634 was investigated by constitutively expressing G634 using the CaMV 35S promoter.

Three constructs were made for G634: P324, P1374 and P1717. P324 was found to encode a truncated protein. P1374 and P1717 represent full-length splice variants of G634; P1374, the shorter of the two splice variants, was used for the experiments described here. The longest available cDNA (P1717), confirmed by RACE, has the same ATG and stop codons as the genomic sequence.

Plants overexpressing G634 from construct P1374 showed a dramatic increase in the density of trichomes, which additionally appeared larger in size. The increase in trichome density was most noticeable on later-arising rosette leaves, cauline leaves, inflorescence stems and sepals, with the stem trichomes being more highly branched than in controls. Approximately half of the primary transformants and two of three T2 lines showed the phenotype. Apart from slight smallness, there did not appear to be any other clear phenotype associated with the overexpression of G634. However, a reduction in germination was observed in T2 seeds grown in culture. It is not clear whether this defect was due to the quality of the seed lot tested or whether this characteristic is related to the transgene overexpression.

RT-PCR data showed that G634 is potentially preferentially expressed in flowers and germinating seedlings, and induced by auxin. The role of auxin in trichome initiation and development has not been established in the published literature.

The increase in trichome density observed in G634 overexpressors suggested a possible role for this gene in drought-stress tolerance, a presumption subsequently confirmed in soil-based drought assays. We tested lines overexpressing G634 in a soil drought assay and found that they showed an enhanced performance versus wild type; G634 overexpressors recovered from the effects of a drought treatment significantly better than wild-type control plants. Additionally, our recent array experiments on plants undergoing a soil-drought experiment indicated that G634 shows a small but significant up-regulation specifically in the recovery phase following re-watering.

G634 overexpressing lines did not exhibit a shade avoidance phenotype when grown under light deficient in the red region of the visible spectrum. When the assay was repeated on individual lines, all three lines analyzed showed the phenotype and had short hypocotyls compared with wild-type seedlings. On control plates, seedlings from line 5 were small while lines 6 and 8 were comparable in size to wild-type seedlings.

Potential Applications

Trichome glands on the surface of many higher plants produce and secrete exudates that give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or may be allergens or irritants that protect against herbivores. Trichomes have also been suggested to decrease transpiration by decreasing leaf surface air flow, and by exuding chemicals that protect the leaf from the sun.

Depending on the plant species, varying amounts of diverse secondary biochemicals (often lipophilic terpenes) are produced and exuded or volatilized by trichomes. These exotic secondary biochemicals, which are relatively easy to extract because they are on the surface of the leaf, have been widely used in such products as flavors and aromas, drugs, pesticides and cosmetics. One class of secondary metabolites, the diterpenes, can affect several biological systems such as tumor progression, prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as insect pheromones and termite allomones, and can exhibit neurotoxic, cytotoxic and antimitotic activities. As a result of this functional diversity, diterpenes have been the target of research by several pharmaceutical ventures. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity.

Thus, the use of G634 and its equivalogs to increase trichome density, size or type may have profound utilities in so-called molecular farming practices (i.e., the use of trichomes as a manufacturing system for complex secondary metabolites), and in producing insect- and herbivore-resistant plants.

Of particular significance, G634 and its equivalogs may also be used to increase the osmotic stress, drought and shade tolerance of plants.

G682 (SEQ ID NO: 147 and 148)

Published Information

G682 was identified from the Arabidopsis BAC, AF007269, based on sequence similarity to other members of the Myb family within the conserved domain.

Experimental Observations

RT-PCR analysis of the endogenous levels of G682 transcripts indicated that this gene was expressed in all tissues tested; however, a very low level of transcript was detected in roots and shoots. Array tissue print data indicated that G682 was expressed primarily, but not exclusively, in flower tissue.

An array experiment was performed on one G682 overexpressing line. The data from this one experiment indicated that this gene could be a negative regulator of chloroplast development and/or light-dependent development because the gene Albino3 and many chloroplast genes are repressed. Albino3 functions to regulate chloroplast development (Sundberg et al. (1997) Plant Cell 9:717-730). The gene G682 was itself induced 20-fold. Other than a few additional transcription factors, very few genes are induced as a result of the ectopic expression of G682.

The function of G682 was analyzed through its ectopic overexpression in plants. G682 overexpressors were glabrous, had tufts of more root hairs and germinated better under heat stress conditions. Older plants were not more tolerant to heat stress compared to wild-type controls.

At the time the overexpression experiments described above were performed, it was suggested that further experiments were needed to address whether the heat germination phenotype of the G682 overexpressors was related to water deficit stress tolerance in the germinating seedling, and correlated with a possible drought tolerance phenotype. More recent experiments have shown that G682 overexpressors were more tolerant to water deprivation conditions in soil-based drought assays than wild-type plants, and two of three lines tested were significantly more drought tolerant than the wild-type controls.

In addition to the paralogous sequences disclosed above, orthologous sequences from other plant species were also identified using BLAST analysis. Such orthologous sequences, together with the paralogous sequences, were determined to be members of the G682 TF family of Myb-related proteins (equivalogs). The paralogous sequences and the orthologous sequences were aligned using MACVECTOR software (Accelrys, Inc.). The software program also generated an exemplary consensus amino acid residue sequence of the aligned sequences.

As shown in FIGS. 3A and 3B, the orthologous sequences shared a consensus sequence with the conserved domain of G682 (amino acid residues 27-63 of SEQ ID NO: 148) and also shared identity with regions flanking the conserved domain (flanking regions). In particular, G682 shared a region of the conserved domain with sequences from soy (Glycine max; SEQ ID NOs: 1084, 1085, 1086, 1083, 1087, and 1088), rice (Oryza sativa; SEQ ID NOs: 559, 1082, and 1081), and maize (corn) (Zea mays; SEQ ID NOs: 1089 and 1090).

An exemplary consensus of the conserved domain of the G682 TF family of Myb-related proteins is Val-Xaa-Met/Phe-Ser/Thr-Gln/Glu-Xaa-Glu-Glu-Asp-Leu-Val-Xaa-Arg-Met-His/Tyr-Lys/Arg-Leu-Val-Gly-Asp/Glu-Arg/Lys-Trp-Glu/Asp-Leu/Ile-Ile-Ala-Gly-Arg-Ile/Val-Pro-Gly-Arg, where Xaa is any amino acid residue. An alternative exemplary consensus of the conserved domain is Val-Xaa-Met/Phe-Ser/Thr-Gln/Glu-Xaa-Glu-Glu-Asp-Leu-Val-Ser-Arg-Met-His-Arg-Leu-Val-Gly-Asn-Arg-Trp-Glu-Leu-Ile-Ala-Gly-Arg-Ile-Xaa-Gly-Arg, where Xaa is any amino acid residue. A further alternative exemplary consensus of the conserved domain is Val-Xaa-Met/Phe-Ser/Thr-Gln/Glu-Xaa-Glu-Glu-Asp-Leu-Val-Ser-Arg-Met-Tyr-Xaa-Leu-Val-Gly-Asn/Glu-Arg-Trp-Ser-Leu-Ile-Ala-Gly-Arg-Ile-Pro-Gly-Arg, where Xaa is any amino acid residue.
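A consensus written in this three-letter notation can be turned mechanically into a pattern for scanning candidate protein sequences. The sketch below, offered only as an illustration, converts the first exemplary G682 consensus into a regular expression over one-letter codes; the function name and the test peptide are hypothetical, and the peptide is constructed to fit the consensus rather than taken from any sequence in this disclosure.

```python
# Minimal sketch: convert a three-letter consensus (with "/" alternatives and Xaa
# wildcards) into a regular expression and scan a protein sequence with it.
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# First exemplary consensus of the G682 conserved domain, as given in the text.
G682_CONSENSUS = (
    "Val-Xaa-Met/Phe-Ser/Thr-Gln/Glu-Xaa-Glu-Glu-Asp-Leu-Val-Xaa-Arg-Met-"
    "His/Tyr-Lys/Arg-Leu-Val-Gly-Asp/Glu-Arg/Lys-Trp-Glu/Asp-Leu/Ile-Ile-"
    "Ala-Gly-Arg-Ile/Val-Pro-Gly-Arg"
)


def consensus_to_regex(consensus: str) -> "re.Pattern":
    """Build a compiled regex: Xaa -> any residue, A/B -> character class [AB]."""
    parts = []
    for position in consensus.split("-"):
        if position == "Xaa":
            parts.append("[A-Z]")
        else:
            letters = "".join(THREE_TO_ONE[res] for res in position.split("/"))
            parts.append(letters if len(letters) == 1 else "[" + letters + "]")
    return re.compile("".join(parts))


# Hypothetical peptide built to satisfy the consensus, with two flanking residues
# on each side; it is NOT a sequence from the disclosure.
HYPOTHETICAL_PEPTIDE = "MM" + "VAMSQAEEDLVARMHKLVGDRWELIAGRIPGR" + "KK"

if __name__ == "__main__":
    pattern = consensus_to_regex(G682_CONSENSUS)
    match = pattern.search(HYPOTHETICAL_PEPTIDE)
    print("match span:", match.span() if match else None)
```

A regular expression of this kind only captures fixed-length, position-by-position agreement; it is not a substitute for the alignment-based comparison described in the text.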

Potential Applications

The potential utility of this gene or its equivalogs is to confer heat tolerance to germinating seeds.

G682 or its equivalogs could be used to alter trichome number and distribution in plants. Trichome glands on the surface of many higher plants produce and secrete exudates, which give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or may be allergens or irritants that protect against herbivores. Trichomes have also been suggested to decrease transpiration by decreasing leaf surface air flow, and by exuding chemicals that protect the leaf from the sun.

G864 (SEQ ID NO: 167 and 168)

Published Information

G864 was identified in an Arabidopsis EST (H37693). G864 appears as gene AT4g23750 in the annotated sequence of Arabidopsis chromosome 4 (AL161560).

Experimental Observations

G864 was discovered and initially identified as a public Arabidopsis EST. G864 was ubiquitously expressed, and was not significantly induced under any of the conditions tested.

The complete sequence of G864 was determined, and G864 was found to be related to two additional Arabidopsis AP2/EREBP genes, G1421 and G1755. The function of G864 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G864 overexpressing plants exhibited a variety of phenotypic alterations. They were smaller than wild-type plants, and those with the strongest phenotypes were classified as dwarf. However, G864 overexpressing lines showed more seedling vigor in a heat stress tolerance germination assay compared to wild-type controls. Conversely, G864 overexpressing lines were also somewhat more sensitive to chilling. One of the three T2 lines analyzed showed a significant increase in fucose and arabinose levels in leaves.

In soil-based assays, G864 overexpressing plants were significantly more drought tolerant than wild-type control plants.

Potential Applications

The germination of many crops is very sensitive to temperature. A gene that would enhance germination in hot conditions, such as G864 or its equivalogs, would be useful for crops that are planted late in the season or in hot climates. G864 and its equivalogs may also be used to improve the drought or other osmotic stress tolerance of plants.

G867 (SEQ ID NO: 169 and 170)

Published Information

G867 corresponds to RAV1 (Kagaya et al. (1999) Nucleic Acids Res. 27:470-478). G867/RAV1 belongs to a small subgroup within the AP2/EREBP family of transcription factors, whose distinguishing characteristic is that its members contain a second DNA-binding domain, in addition to the conserved AP2 domain, that is related to the B3 domain of VP1/ABI3 (Kagaya et al. (1999) supra). It has been shown that the two DNA-binding domains of RAV1 can separately recognize each of two motifs that constitute a bipartite binding sequence and together cooperatively enhance its DNA-binding affinity and specificity (Kagaya et al. (1999) supra).

Experimental Observations

G867 was discovered and initially identified as a public Arabidopsis EST. G867 appeared to be constitutively expressed at medium levels.

G867 was first characterized using a line that contained a T-DNA insertion in the gene. The insertion in that line resided immediately downstream of the conserved AP2 domain, and would therefore be expected to result in a severe or null mutation. G867 knockout mutant plants did not show significant changes in overall plant morphology, and significant differences between these plants and control plants have not been detected in any of the assays that have been performed so far.

Subsequently, the function of G867 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G867 overexpressing lines were morphologically wild-type and no phenotypic alterations in G867 overexpressing lines were detected in the biochemical assays that were performed. However, G867 overexpressing lines showed increased seedling vigor (manifested by increased expansion of the cotyledons) in germination assays on both high salt and high sucrose containing media, compared to wild-type controls.

The Arabidopsis paralogs G1930 (SEQ ID NO: 369) and G9 (SEQ ID NO: 1949) also showed stress-related phenotypes. G9 exhibited increased root biomass, and thus could be used to produce better plant growth under adverse osmotic conditions. Genetic and physiological evidence indicates that roots subjected to various stresses, including water deficit, alter the export of specific compounds, such as ACC and ABA, to the shoot, via the xylem (Bradford et al. (1980) Plant Physiol. 65: 322-326; Schurr et al. (1992) Plant Cell Environ. 15, 561-567).

G1930 plants responded to high NaCl and high sucrose on plates with more seedling vigor and root biomass compared to wild-type control plants; this phenotype was identical to that seen in 35S::G867 lines. These results indicate a general involvement of this clade in abiotic stress responses.

The polypeptide sequences of G1930 and G9 share 72% (249/345 residues) and 64% (233/364 residues) identity with G867, respectively. The conserved domains of G1930 and G9 are 86% (56/65 residues) and 86% (56/65 residues) identical with the conserved domain of G867, respectively.

In addition to the paralogous sequences disclosed above, orthologous sequences from other plant species were also identified using BLAST analysis. Such orthologous sequences, together with the paralogous sequences, were determined to be members of the G867 TF family of AP2 proteins (equivalogs). The paralogous sequences and the orthologous sequences were aligned using MACVECTOR software (Accelrys, Inc.). The software program also generated an exemplary consensus amino acid residue sequence of the aligned sequences.

As shown in FIGS. 4A, 4B, 4C, and 4D, the orthologous sequences shared a consensus sequence with the conserved domain of G867 (amino acid residues 59-116 of SEQ ID NO: 170) and also shared identity with regions flanking the conserved domain (flanking regions). In particular, G867 shared a region of the conserved domain with sequences from soy (Glycine max; SEQ ID NOs: 1184, 1183, and 1182), rice (Oryza sativa; SEQ ID NOs: 1176, 1177, and 1178), and maize (corn) (Zea mays; SEQ ID NOs: 1186 and 1185).

An exemplary consensus of the conserved domain of the G867 TF family of AP2 proteins is Ser-Ser-Lys/Arg-Tyr/Phe-Gly-Val-Val-Pro-Gln-Pro-Asn-Gly-Arg-Trp-Gly-Ala-Gln-Ile-Tyr-Glu-Lys/Arg-His-Gln-Arg-Val-Trp-Leu-Gly-Thr-Phe-Xaa-Glu/Asp-Glu-Glu-Glu/Asp-Ala-Ala/Val-Arg-Ala/Ser-Tyr-Asp-Val/Ile-Ala/Val-Val/Ala-Xaa-Arg-Phe/Tyr-Arg-Arg/Gly-Arg-Asp-Ala-Val-Thr/Val-Asn-Phe-Lys/Arg, where Xaa is any amino acid residue. An alternative exemplary consensus of the conserved domain is Ser-Ser-Lys/Arg-Tyr/Phe-Gly-Val-Val-Pro-Gln-Pro-Asn-Gly-Arg-Trp-Gly-Ala-Gln-Ile-Tyr-Glu-Lys/Arg-His-Gln-Arg-Val-Trp-Leu-Gly-Thr-Phe-Xaa-Glu/Asp-Glu-Glu/Asp-Ala-Ala-Ala-Arg-Ala-Tyr-Asp-Val/Ile-Ala/Val-Val/Ala-Xaa-Arg-Phe/Tyr-Arg-Arg/Gly-Arg-Asp-Ala-Val-Thr/Val-Asn, where Xaa is any amino acid residue. A further alternative exemplary consensus of the conserved domain is Ser-Ser-Lys/Arg-Tyr/Phe-Gly-Val-Val-Pro-Gln-Pro-Asn-Gly-Arg-Trp-Gly-Ala-Gln-Ile-Tyr-Glu-Lys/Arg-His-Gln-Arg-Val-Trp-Leu-Gly-Thr-Phe-Xaa-Gly-Glu-Ala/Asp-Glu/Asp-Ala-Ala/Val-Arg-Ala-Tyr-Asp-Val-Ala-Ala-Gln-Arg-Phe/Tyr-Arg-Arg/Gly-Arg-Asp-Ala-Val-Thr/Val-Asn-Phe-Arg, where Xaa is any amino acid residue.

Potential Applications

G867 or its equivalogs could be used to increase or facilitate seed germination and seedling growth under adverse environmental conditions, in particular salt stress.

G867 or its equivalogs may also be used to modify sugar sensing.

G912 (SEQ ID NO: 185 and 186)

Published Information

G912 was identified in the sequence of P1 clone MSG15 (GenBank accession number AB015478; gene MSG15.6).

Experimental Observations

G912 was recognized as the AP2/EREBP gene most closely related to Arabidopsis CBF1, CBF2, and CBF3 (Stockinger et al. (1997) Proc. Natl. Acad. Sci. USA 94:1035-1040; Gilmour et al. (1998) Plant J. 16:433-442). In fact, G912 is the only other AP2/EREBP transcription factor for which sequence similarity with CBF1, CBF2, and CBF3 extends beyond the conserved AP2 domain.

The function of G912 was studied using transgenic plants in which this gene was expressed under the control of the 35S promoter. Plants overexpressing G912 were more freezing and drought tolerant than the wild-type controls, but were also small, dark green, and late flowering. There was a positive correlation between the degree of growth impairment and the freezing tolerance. In addition, G912 expression appeared to be induced by cold, drought, and osmotic stress.

G912 overexpressing plants also exhibited a sugar sensing phenotype: reduced seedling vigor and cotyledon expansion upon germination on high glucose media.

These results mirror the extensive body of work that has shown that CBF1, CBF2, and CBF3 are involved in the control of the low-temperature response in Arabidopsis, and that those genes can be used to improve freezing, drought, and salt tolerance in plants (Stockinger et al. (1997) Proc. Natl. Acad. Sci. USA 94:1035-1040; Gilmour et al. (1998) Plant J. 16:433-442; Jaglo-Ottosen et al. (1998) Science 280:104-106; Liu et al. (1998) Plant Cell 10:1391-1406; Kasuga et al. (1999) Nat. Biotechnol. 17:287-291).

The polypeptide sequences of G40, G41, and G42 share 71% (140 of 195 residues), 68% (144 of 211 residues), and 65% (147 of 224 residues) identity with G912, respectively. The conserved domains of G40, G41, and G42 share 94% (64 of 68 residues), 92% (63 of 68 residues), and 94% (64 of 68 residues) identity with G912, respectively.
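The percent identity values quoted above follow directly from the residue counts. As a quick check, the sketch below recomputes them; the counts are taken from the text, while the function name and the use of truncation are assumptions made here (the quoted figures happen to agree with truncating toward zero rather than rounding, but the text does not state which convention was used).

```python
# Minimal sketch: recompute percent identity from identical/aligned residue counts.
# Truncation (integer floor) is an assumption; it reproduces the figures quoted
# in the text, whereas rounding to nearest would not in every case.


def percent_identity(identical: int, aligned: int) -> int:
    """Return percent identity, truncated to a whole number."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100 * identical // aligned


# (identical residues, aligned residues) relative to G912, from the text.
FULL_LENGTH = {"G40": (140, 195), "G41": (144, 211), "G42": (147, 224)}
CONSERVED_DOMAIN = {"G40": (64, 68), "G41": (63, 68), "G42": (64, 68)}

if __name__ == "__main__":
    for name in FULL_LENGTH:
        full = percent_identity(*FULL_LENGTH[name])
        domain = percent_identity(*CONSERVED_DOMAIN[name])
        print(f"{name}: full length {full}%, conserved domain {domain}%")
```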

In addition to the paralogous sequences disclosed above, orthologous sequences from other plant species were also identified using BLAST analysis. Such orthologous sequences, together with the paralogous sequences, were determined to be members of the G912 TF family of AP2/EREBP proteins (equivalogs). The paralogous sequences and the orthologous sequences were aligned using MACVECTOR software (Accelrys, Inc.). The software program also generated an exemplary consensus amino acid residue sequence of the aligned sequences.

As shown in FIGS. 5A, 5B, 5C, 5D, 5E, and 5F, the orthologous sequences shared a consensus sequence with the conserved domain of G912 (amino acid residues 51-118 of SEQ ID NO: 186) and also shared identity with regions flanking the conserved domain (flanking regions). In particular, G912 shared a region of the conserved domain with sequences from soy (Glycine max; SEQ ID NOs: 1238, 1242, 1240, 1241, and 1243), rice (Oryza sativa; SEQ ID NOs: 1222, 1223, 1232, 1221, 1231, 1227, 1235, 1230, 1229, and 1228), and maize (corn) (Zea mays; SEQ ID NOs: 1246, 1247, 1244, and 1245).

An exemplary consensus of the conserved domain of the G912 TF family of AP2/EREBP proteins is His-Pro-Ile/Val-Tyr/Phe-Arg/Lys-Gly-Val-Arg-Gln/Arg-Arg-Gly/Asn-Xaa₍₁₋₃₎-Lys/Arg-Trp-Val-Cys/Ser-Glu-Val/Leu-Arg-Glu/Val-Pro-Asn-Lys-Xaa₍₂₎-Arg-Ile/Leu-Trp-Leu-Gly-Thr-Phe/Tyr-Xaa₍₂₎-Ala/Pro-Glu-Met-Ala-Ala-Arg-Ala-His-Asp-Val-Ala-Ala/Met-Leu/Met-Ala-Leu-Arg-Gly-Xaa₍₁₋₈₎-Ala-Cys-Leu-Asn-Phe-Ala-Asp-Ser-Xaa₍₁₋₅₎-Val/Ile-Pro/Asp, where Xaa is any amino acid residue.

An alternative exemplary consensus of the conserved domain is His-Pro-Ile/Val-Tyr/Phe-Arg/Lys-Gly-Val-Arg-Xaa-Arg-Gly/Asn-Xaa₍₁₋₃₎-Lys/Arg-Trp-Val-Cys/Ser-Glu-Val/Leu-Arg-Glu/Val-Pro-Xaa₍₁₋₅₎-Arg-Ile/Leu/Phe-Trp-Leu-Gly-Thr-Phe/Tyr-Xaa₍₂₎-Ala/Pro-Glu-Xaa-Ala-Ala-Arg-Ala-His-Asp-Val-Ala-Ala/Met-Leu/Met-Ala-Leu-Arg-Gly-Xaa₍₁₋₈₎-Ala-Cys/Ser-Leu-Asn-Phe-Ala-Asp-Ser-Xaa₍₁₋₅₎-Val/Ile-Pro/Asp, where Xaa is any amino acid residue.

An exemplary flanking region consensus sequence of the G912 TF family of AP2/EREBP proteins is Pro-Lys-Xaa-Xaa-Ala-Gly-Arg (amino acids 37-43 of SEQ ID NO: 186), or Ala-Gly-Arg-Xaa-Lys-Phe (amino acids 41-46 of SEQ ID NO: 186) or Glu-Thr-Arg-His-Pro (amino acids 48-52 of SEQ ID NO: 186), where Xaa is any amino acid residue.

Potential Applications

G912 or its equivalogs could be used to improve plant tolerance to cold, freezing, drought, and salt stress. In addition, G912 or its equivalogs could be used to change a plant's flowering time and size.

G913 (SEQ ID NO: 187 and 188)

Published Information

G913 was identified in the sequence of clone MSG15; it corresponds to gene MSG15.10 (GenBank PID BAB11050).

Experimental Observations

The cDNA sequence of G913 was determined. To investigate the function(s) of G913, this gene was expressed under the control of the 35S promoter in transgenic plants. G913 overexpressing plants had dark green leaves that occasionally curled downward. These plants showed a delay in flowering and were also late senescing. G913 overexpressing lines were more freezing tolerant and more drought tolerant than the wild-type controls.

In an ethylene sensitivity assay where the plants were tested for a triple response phenotype on plates containing ACC, G913 overexpressing plants showed stunting and curling in the hypocotyl region that was more exaggerated than the wild-type triple response.

Potential Applications

G913 or its equivalogs could be used to improve plant tolerance to freezing and drought. G913 could also be used to manipulate the ethylene response.

G913 or its equivalogs may be used to delay flowering or senescence in plants. Extending vegetative development could bring about large increases in yields.

Additionally, a major concern is the escape of transgenic pollen from GMOs to wild species or so-called organic crops. Systems that prevent vegetative transgenic crops from flowering would eliminate this worry.

G922 (SEQ ID NO: 189 and 190)

Published Information

G922 corresponds to Scarecrow-like 3 (SCL3), first described by Pysh et al. (GenBank accession number AF036301; (1999) Plant J. 18: 111-119). Northern blot analysis results show that G922 is expressed in siliques, roots, and to a lesser extent in shoot tissue from 14-day-old seedlings. Pysh et al. did not test any other tissues for G922 expression. In situ hybridization results showed that G922 was expressed predominantly in the endodermis in the root tissue. This pattern of expression was very similar to that of SCARECROW (SCR), G306. Experimental evidence indicated that the co-localization of the expression is not due to cross-hybridization of the G922 probe with G306. Pysh et al. proposed that G922 may play a role in epidermal cell specification and that G922 may either regulate or be regulated by G306.

The sequence for G922 can also be found in the annotated BAC clone F11F12 from chromosome 1 (GenBank accession number AC012561). The sequence for F11F12 was submitted to GenBank by the DNA Sequencing and Technology Center at Stanford University.

Experimental Observations

The function of this gene was analyzed using transgenic plants in which G922 was expressed under the control of the 35S promoter. Transgenic plants overexpressing G922 were more salt tolerant than wild-type plants as determined by a root growth assay on MS media supplemented with 150 mM NaCl. Plants overexpressing G922 were also more tolerant to osmotic stress as determined by germination assays in salt-containing (150 mM NaCl) and sucrose-containing (9.4%) media. Morphologically, plants overexpressing G922 had altered leaf morphology, coloration, fertility, and overall plant size. In wild-type plants, expression of G922 was induced by auxin, ABA, heat, and drought treatments. In non-induced wild-type plants, G922 was expressed constitutively at low levels.

The high salt assays suggested that this gene would confer drought tolerance, a supposition confirmed by soil-based assays, in which G922-overexpressing plants were significantly healthier after water deprivation treatment than wild-type control plants.

Potential Applications

Based upon the results observed in plants overexpressing G922, G922 or its equivalogs could be used to alter salt tolerance, tolerance to osmotic and drought stress, and leaf morphology in other plant species.

G926 (SEQ ID NO: 191 and 192)

Published Information

G926 is equivalent to Hap2a (Y13720), a member of the CCAAT-box binding transcription factor family. The gene was identified by Edwards et al. ((1998) Plant Physiol. 117: 1015-1022), who demonstrated that G926 (AtHap2a) was able to functionally complement a Hap2-deficient mutant of yeast, suggesting that there is functional conservation between these proteins from diverse organisms. In addition, the AtHap2a gene was shown to be ubiquitously expressed in Arabidopsis.

Experimental Observations

Consistent with the published expression pattern (Edwards et al. (1998) Plant Physiol. 117: 1015-1022), G926 was determined to be ubiquitously expressed and transcript levels appeared to be unaltered by any environmental stress-related condition tested. A line homozygous for a T-DNA insertion in G926 was used to determine the function of this gene.

The G926 knockout mutant line was morphologically wild-type. Physiological analysis revealed that in the presumed absence of G926 function, the plants became more tolerant to high osmotic conditions during germination. This osmotic stress tolerance could be related to the plant's apparent insensitivity to the growth hormone ABA. This was the second instance where a member of a CCAAT-box protein complex altered the plant's osmotic stress response and ABA sensitivity during germination.

ABA plays an important regulatory role in the initiation and maintenance of seed dormancy. Lopez-Molina, L. et al. ((2001) Proc. Natl. Acad. Sci. USA 98: 4782-4787) describe a bZIP transcription factor, ABI5, that is involved in maintaining seeds in a quiescent state, preventing germination under adverse conditions such as drought stress. It is possible that G926 also functions as part of this checkpoint for the germinating seeds and that loss of G926 function promotes germination regardless of the osmotic status of the environment.

Potential Applications

G926 or its equivalogs could be used to improve plant tolerance to drought and salt stress.

Evaporation from the soil surface causes upward water movement and salt accumulation in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt concentration in the whole soil profile. Increased salt tolerance during the germination stage of a crop plant would impact survivability and yield.

G975 (SEQ ID NO: 199 and 200)

Published Information

G975 has appeared in the sequences released by the Arabidopsis Genome Initiative (BAC F9L1, GenBank accession number AC007591).

Experimental Observations

G975 was expressed in flowers and, at lower levels, in shoots, leaves, and siliques. GC-FID and GC-MS analyses of leaves from G975 overexpressing plants showed that the levels of C29, C31, and C33 alkanes were substantially increased (up to ten-fold) compared with control plants. A number of additional compounds of similar molecular weight, presumably also wax components, also accumulated to significantly higher levels in G975 overexpressing plants. C29 alkanes constituted close to 50% of the wax content in wild-type plants (Millar et al. (1998) Plant Cell 11:1889-1902), suggesting that a major increase in total wax content occurred in the G975 transgenic plants. However, the transgenic plants had an almost normal phenotype (although small morphological differences were detected in leaf appearance), indicating that overexpression of G975 was not deleterious to the plant. Overexpression of G975 did not cause the dramatic alterations in plant morphology that had been reported for Arabidopsis plants in which the FATTY ACID ELONGATION1 gene was overexpressed (Millar et al. (1998) Plant Cell 11:1889-1902). G975 may regulate the expression of some of the genes involved in wax metabolism. One Arabidopsis AP2 sequence (G1387) that is significantly more closely related to G975 than the rest of the members of the AP2/EREBP family is predicted to have a function and a use related to that of G975.

G975 overexpressing plants were significantly more drought tolerant than wild-type control plants in soil-based drought assays.

Potential Applications

G975 or its equivalogs can be used to manipulate wax composition, amount, or distribution, which in turn can modify plant tolerance to drought and/or low humidity or resistance to insects, as well as plant appearance (shiny leaves).

G975 or its equivalogs can also be used to specifically alter wax composition, amount, or distribution in those plants and crops from which wax is a valuable product.

G993 (SEQ ID NO: 2071 and 2072)

G993 is a putative paralog of G867. No genetic analysis of the locus has been published.

During our earlier genomics program, we observed that G993 overexpression lines exhibited a number of morphological abnormalities. The aim of this study was to re-assess 35S::G993 transformants (using a greater number of lines), and determine whether overexpression of the gene could confer enhanced stress tolerance in a comparable manner to G867.

New overexpression lines have been obtained using a direct promoter fusion construct. These lines exhibited similar phenotypes to those observed during our phase I genomics studies and were generally small, slow developing, and poorly fertile. Some lines also showed features that suggested a disruption in light regulated development, such as long hypocotyls and alterations in leaf orientation.

We have tested ten of the 35S::G993 direct-fusion lines under a variety of plate based treatments. Eight of these lines out-performed controls in one or more plate based stress assays including germination on salt, germination on sucrose, germination in the cold, or growth under chilling conditions. Interestingly, two 35S::G993 lines showed a very dramatic increase in root hair density when grown on regular MS plates (this phenotype was comparable to that observed in 35S::G9 lines during our phase I genomics program). However, the two lines that exhibited increased root hair development did not show a positive result in the stress assays.
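The tally of lines that out-performed controls can be checked against the scores reported in Table 21. The following is a minimal sketch, assuming that "+" and "++" denote better-than-control performance and "wt" denotes wild-type behavior; the assay keys and variable names are illustrative labels, not identifiers from the original study.

```python
# Scores transcribed from Table 21 (35S::G993, direct promoter-fusion lines).
# Assumption: "+" / "++" = out-performed controls, "wt" = wild-type behavior.
ASSAYS = [
    "germ_high_nacl", "germ_high_mannitol", "sucrose", "aba",
    "germ_heat", "germ_cold", "growth_heat", "drought", "chilling",
]

TABLE_21 = {
    321: "+ wt wt wt wt wt wt wt +",
    322: "wt wt + wt wt + wt wt +",
    324: "wt wt wt wt wt wt wt wt +",
    326: "+ wt wt wt wt wt wt wt +",
    330: "wt wt wt wt wt wt wt wt wt",
    331: "+ wt + wt wt wt wt wt +",
    334: "+ wt ++ wt wt ++ wt wt wt",
    335: "+ wt ++ wt wt + wt wt wt",
    337: "wt wt wt wt wt wt wt wt wt",
    340: "+ wt + wt wt wt wt wt +",
}


def positive_assays(row):
    """Return the assays in which a line scored '+' or '++'."""
    scores = row.split()
    if len(scores) != len(ASSAYS):
        raise ValueError("row does not have one score per assay")
    return [assay for assay, score in zip(ASSAYS, scores) if score in ("+", "++")]


positives = {line: positive_assays(row) for line, row in TABLE_21.items()}
lines_with_positive = sorted(line for line, hits in positives.items() if hits)

print(len(lines_with_positive), "of", len(TABLE_21),
      "lines scored positive in at least one assay")
```

Under these assumptions the count agrees with the eight of ten lines described in the text; lines 330 and 337 are the two that scored wild-type in every assay.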

It should be emphasized that we have obtained comparable developmental effects as well as a strong enhancement of drought related stress tolerance for all four of the Arabidopsis genes in the G867 study group (G9, G867, G993, and G1930). The almost identical phenotypic effects produced by these genes strongly suggest that they are functionally equivalent.

TABLE 21 G993 35S, Direct promoter-fusion

Line  Germ. in  Germ. in  Sucrose   ABA       Germ. in  Germ. in  Growth    Drought   Chilling
      high      high                          heat      cold      in heat
      NaCl      mannitol
321   +         wt        wt        wt        wt        wt        wt        wt        +
322   wt        wt        +         wt        wt        +         wt        wt        +
324   wt        wt        wt        wt        wt        wt        wt        wt        +
326   +         wt        wt        wt        wt        wt        wt        wt        +
330   wt        wt        wt        wt        wt        wt        wt        wt        wt
331   +         wt        +         wt        wt        wt        wt        wt        +
334   +         wt        ++        wt        wt        ++        wt        wt        wt
335   +         wt        ++        wt        wt        +         wt        wt        wt
337   wt        wt        wt        wt        wt        wt        wt        wt        wt
340   +         wt        +         wt        wt        wt        wt        wt        +

Potential Applications

Based on the results of our overexpression studies, G867 and related sequences such as G993 are excellent candidates for improvement of drought and cold-related stress tolerance in commercial species. The morphological effects associated with their overexpression suggest that tissue-specific or conditional promoters might be used to optimize the utility of these genes.

The increased root hair production seen in 35S::G993 lines indicates that the gene might be used to enhance root growth and differentiation and might thereby improve performance under other stresses, such as low nutrient availability.

G1048 (SEQ ID NO: 2515 and 2516)

Published Information

G1048 (AT1G42990) was initially identified as public partial EST T88194 and in BAC F13A11 (GenBank accession AC068324) released by the Arabidopsis Genome Initiative. There is no published information on the function(s) of G1048.

Experimental Observations

RT-PCR expression analysis indicated that G1048 was constitutively expressed and not induced by any condition tested. The function of G1048 was initially investigated by constitutively expressing the gene using the 35S promoter. Plants overexpressing G1048 were not significantly different from controls in any assay performed.

In initial experiments, G1048 overexpressing lines did not exhibit a shade avoidance phenotype when grown under light deficient in the red region of the visible spectrum. This effect was seen in two repeat experiments on a batch of mixed seed from three independent lines.

In repeat experiments, 35S::G1048 lines grown under white light versus light deficient in red light were analyzed. A significant shade tolerance phenotype was observed, indicating that G1048 might be involved in the transcriptional regulation of response to shade or light quality. G1048 is potentially related to HY5 (Oyama et al. (1997) Genes Dev. 11:2983-2995), a gene that is well established to be involved in light regulated development.

G1069 (SEQ ID NO: 221 and 222)

Published Information

The sequence of G1069 was obtained from the EU Arabidopsis sequencing project, GenBank accession number Z97336, based on its sequence similarity within the conserved domain to other AT-Hook related proteins in Arabidopsis.

Experimental Observations

The sequence of G1069 was experimentally determined and the function of G1069 was analyzed using transgenic plants in which G1069 was expressed under the control of the 35S promoter.

Plants overexpressing G1069 showed changes in leaf architecture, reduced overall plant size, and retarded progression through the life cycle. This is a common phenomenon for most transgenic plants in which AT-Hook proteins are overexpressed if the gene is predominantly expressed in root in the wild-type background. G1069 was predominantly expressed in roots, based on analysis of RT-PCR results. To minimize these detrimental effects, G1069 may be overexpressed under a tissue-specific promoter, such as a root- or leaf-specific promoter, or under an inducible promoter.

One of the G1069 overexpressing lines showed more tolerance to osmotic stress when germinated on high sucrose plates. This line also showed insensitivity to ABA in a germination assay.

Potential Applications

The osmotic stress results indicate that G1069 could be used to alter a plant's response to water deficit conditions and, therefore, the gene or its equivalogs could be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing.

G1069 affects ABA sensitivity, and thus when transformed into a plant the gene or its equivalogs may diminish cold, drought, oxidative and other stress sensitivities, and also be used to alter plant architecture and yield.

G1073 (SEQ ID NO: 223 and 224)

Published Information

G1073 has been identified in the sequence of a BAC clone from chromosome 4 (BAC clone F23E12, gene F23E12.50, GenBank accession number AL022604), released by the EU Arabidopsis Sequencing Project.

Experimental Observations

The function of G1073 was analyzed using transgenic plants in which G1073 was expressed under the control of the 35S promoter. Transgenic plants overexpressing G1073 were substantially larger than wild-type controls, with at least a 60% increase in biomass. The increased mass of 35S::G1073 transgenic plants was attributed to enlargement of multiple organ types including leaves, stems, roots and floral organs. Petal size in the 35S::G1073 lines was increased by 40-50% compared to wild-type controls. Petal epidermal cells in those same lines were approximately 25-30% larger than those of the control plants. Furthermore, 15-20% more epidermal cells per petal were produced compared to wild type. Thus, at least in petals, the increase in size was associated with an increase in cell size as well as in cell number. Additionally, images from the stem cross-sections of 35S::G1073 plants revealed that cortical cells were large and that vascular bundles contained more cells in the phloem and xylem relative to wild type.
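The relationship between the petal measurements above can be illustrated with a short calculation. This is only a sketch under an assumption that the text does not state: that petal size scales as the product of mean epidermal cell size and epidermal cell number. The function name is hypothetical.

```python
# Assumption (not stated in the source): petal size is proportional to
# (mean epidermal cell size) x (number of epidermal cells per petal).
def combined_increase(cell_size_gain, cell_number_gain):
    """Fractional size increase implied by two independent fractional gains."""
    return (1 + cell_size_gain) * (1 + cell_number_gain) - 1


# Lower and upper ends of the ranges reported for 35S::G1073 petals.
low = combined_increase(0.25, 0.15)
high = combined_increase(0.30, 0.20)

print(f"implied petal size increase: {low:.0%} to {high:.0%}")
```

Under this multiplicative assumption, the two reported ranges together imply an increase of roughly 44% to 56%, which is of the same order as the 40-50% increase in petal size reported for these lines.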

Seed yield was increased compared to control plants. 35S::G1073 lines showed an increase of at least 70% in seed yield. This increased seed production was associated with an increased number of siliques per plant, rather than seeds per silique.

Flowering of G1073 overexpressing plants was delayed. Leaves of G1073 overexpressing plants were generally more serrated than those of wild-type plants. Improved drought tolerance was observed in 35S::G1073 transgenic lines.

Potential Applications

Transgenic plants overexpressing G1073 are large and late flowering with serrated leaves. Large size and late flowering produced as a result of G1073 or equivalog overexpression would be extremely useful in crops where the vegetative portion of the plant is the marketable portion (often vegetative growth stops when plants make the transition to flowering). In this case, it would be advantageous to prevent or delay flowering with the use of this gene or its equivalogs in order to increase yield (biomass). Prevention of flowering by this gene or its equivalogs would be useful in these same crops in order to prevent the spread of transgenic pollen and/or to prevent seed set. This gene or its equivalogs could also be used to manipulate leaf shape and drought tolerance.

G1075 (SEQ ID NO: 225 and 226)

Published Information

The sequence of G1075 was obtained from the Arabidopsis genome sequencing project, GenBank accession number AC004667, based on its sequence similarity within the conserved domain to other AT-Hook related proteins in Arabidopsis.

Experimental Observations

The function of G1075 was analyzed using transgenic plants in which G1075 was expressed under the control of the 35S promoter. Overexpression of G1075 produced very small, sterile plants. Pointed leaves were noted in some seedlings, and twisted or curled leaves and abnormal leaf serrations were noted in rosette stage plants. Bolts were short and thin with short internodes. Flowers from severely affected plants had reduced or absent petals and stamen filaments that partially or completely failed to elongate. Because of the severe phenotypes of these T1 plants, no T2 seed was produced for physiological and biochemical analysis.

RT-PCR analysis indicated that G1075 transcripts were found primarily in roots. The expression of G1075 appeared to be induced by cold and heat stresses.

Potential Applications

G1075 or its equivalogs could be used to modify plant architecture and development, including flower structure. If expressed under a flower-specific promoter, the gene or its equivalogs might also be useful for engineering male sterility. Because expression of G1075 is root specific, its promoter could be useful for targeted gene expression in this tissue.

Arabidopsis G1364 (SEQ ID NO: 2101 and 2102)

G1364, a putative paralog of G481, is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family.

Experimental Observations

The aim of the present study was to reevaluate the effects of G1364 overexpression and determine whether the gene confers similar effects to G481. We have now obtained two sets of 35S::G1364 lines using a two-component approach. A significant number of these transformants showed delayed flowering, indicating that G1364 can act as a repressor of the floral transition. Plants from this set also senesced later than wild type. In other respects, these transformants showed wild-type morphology.

As shown in Table 22, two G1364-overexpressing lines tested were less sensitive to ABA than wild-type or non-transformed plants.

TABLE 22 G1364 35S, 2-components-supTfn

Line  Germ. in  Germ. in  Sucrose   ABA       Germ. in  Germ. in  Growth    Drought   Chilling
      high      high                          heat      cold      in heat
      NaCl      mannitol
341   wt        wt        wt        wt        wt        wt        wt        wt        wt
342   wt        wt        wt        wt        wt        wt        wt        wt        wt
343   wt        wt        wt        wt        wt        wt        wt        wt        wt
344   wt        wt        wt        wt        wt        wt        wt        wt        wt
345   wt        wt        wt        wt        wt        wt        wt        wt        wt
346   wt        wt        wt        wt        wt        wt        wt        wt        wt
423   wt        wt        wt        wt        wt        wt        wt        wt        wt
431   wt        wt        wt        wt        wt        wt        wt        wt        wt
432   wt        wt        wt        +         wt        wt        wt        wt        wt
435   wt        wt        wt        +         wt        wt        wt        wt        wt

Potential Applications

Based on the results of the ABA germination assay, G481 and related sequences such as G1364 may be used to improve stress tolerance in plants.

G1364 might also be used to modify flowering time related traits.

G1411 (SEQ ID NO: 269 and 270)

Published Information

G1411 was identified in the sequence of TAC clone K22G18 (GenBank accession number AB022212).

Experimental Observations

The complete sequence of G1411 was determined. The function of G1411 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G1411 overexpressing plants were smaller than wild-type controls and showed reduced apical dominance: axillary shoots developed prematurely amongst primary rosette leaves, resulting in a bushy plant. G1411 overexpressing plants behaved like the corresponding wild-type controls in all physiological and biochemical assays that were performed.

Potential Applications

G1411 or its equivalogs could be used to manipulate plant architecture.

G1451 (SEQ ID NO: 277 and 278)

Published Information

G1451 is ARF8, a member of the ARF class of proteins with a VP1-like N-terminal domain and a C-terminal domain with homology to Aux/IAA proteins. ARF8, like several other ARFs, contains a glutamine-rich central domain that can function as a transcriptional activation domain (1). ARF8 was shown to bind to an auxin response element (2). It was also shown that a truncated version of ARF8 lacking the DNA binding domain but containing the activation domain and the C-terminal domain could activate transcription on an auxin responsive promoter, presumably through interactions with another factor bound to the auxin response element (1). ARF8 is closely related in sequence to ARF6 (2).

Experimental Observations

G1451 was expressed throughout the plant, with the highest expression in flowers. Transcripts of G1451 were induced in leaves by a variety of stress conditions. A line homozygous for a T-DNA insertion in G1451 was used to determine the function of this gene. The T-DNA insertion in G1451 is approximately one-fifth of the way into the coding sequence of the gene and therefore is likely to result in a null mutation.

As measured by NIR, G1451 knockout mutants had increased total combined seed oil and seed protein content compared to wild-type plants.

Potential Applications

G1451 or its equivalogs may be used to alter seed oil and protein content, which may be very important for the nutritional value and production of various food products. G1451 or its equivalogs could also be used to increase plant biomass. Large size is useful in crops where the vegetative portion of the plant is the marketable portion, since vegetative growth often stops when plants make the transition to flowering.

G1543 (SEQ ID NO: 303 and 304)

Published Information

G1543 was identified as a novel homeobox gene within section 3 of 255 from the complete sequence of Chromosome II (GenBank accession number AC005560, released by the Arabidopsis Genome Initiative).

Experimental Observations

The ends of G1543 were determined by RACE and a full-length cDNA was isolated by PCR from mixed cDNA. The encoded 275 amino acid product was found to be a member of the HD-ZIP class II group of HD proteins. The public annotation for this gene was incorrect; the protein predicted in the BAC report was only 162 amino acids in length.

RT-PCR analysis revealed that G1543 was expressed ubiquitously but was up-regulated in response to auxin applications.

The function of G1543 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. 35S::G1543 Arabidopsis plants exhibited a range of phenotypes; most consistently, however, the plants possessed dark green leaves and an altered branching pattern that led to a shorter, more compact stature. These morphological phenotypes, along with the expression data, implicate G1543 as a component of a growth or developmental response to auxin.

Biochemical assays reflected the changes in leaf color noted during morphological analysis. All three T2 lines examined displayed increased levels of leaf chlorophylls and carotenoids. Additionally, one of three lines had a decrease in seed oil combined with an increase in seed protein. A repeat experiment verified the altered seed oil and protein composition in two lines.

Physiological assays identified no clear differences between 35S::G1543 and wild-type plants.

Potential Applications

The altered levels of chlorophylls, carotenoids, seed oils, and proteins that resulted from overexpression of the gene in Arabidopsis indicate that G1543 or its equivalogs might be used to manipulate the composition of these substances in seed, with applications toward the improvement in the nutritional value of foodstuffs (for example, by increasing lutein).

Enhanced chlorophyll and carotenoid levels could also improve yield in crop plants. For instance, lutein, like other xanthophylls such as zeaxanthin and violaxanthin, is an essential component in the protection of the plant against the damaging effects of excessive light. Specifically, lutein contributes, directly or indirectly, to the rapid rise of non-photochemical quenching in plants exposed to high light. Crop plants engineered to contain higher levels of lutein could therefore have improved photo-protection, possibly leading to less oxidative damage and better growth under high light. Additionally, elevated chlorophyll levels might increase photosynthetic capacity.

G1543 or its equivalogs might be applied to modify plant stature. This could be used to produce crops that are more resistant to damage by wind and rain, or more amenable to harvest. Plants with altered stature might also be of interest to the ornamental plant market.

This gene or its equivalogs may also be used to alter oil production in seeds, which may be very important for the nutritional quality and caloric content of foods.

G1792 (SEQ ID NO: 331 and 332)

Published Information

G1792 was identified in the sequence of BAC clone K14B15 (AB025608, gene K14B15.14).

Experimental Observations

G1792 (SEQ ID NO: 331) was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. 35S::G1792 plants were more tolerant to the fungal pathogens Fusarium oxysporum and Botrytis cinerea and showed fewer symptoms after inoculation with a low dose of each pathogen. This result was confirmed in T2 lines. The effect of G1792 overexpression in increasing tolerance to pathogens received further, incidental confirmation. T2 plants of two 35S::G1792 lines had been growing in a room that suffered a serious powdery mildew infection. For each line, a pot of six plants was present in a flat containing nine other pots of lines from unrelated genes. In either of the two different flats, the only plants that were free from infection were those from the 35S::G1792 line. This observation indicated that G1792 overexpression might be used to increase resistance to powdery mildew. Additional experiments confirmed that 35S::G1792 plants showed increased tolerance to Erysiphe. G1792 was ubiquitously expressed, but appeared to be induced by salicylic acid.

35S::G1792 overexpressing plants also showed more tolerance to growth under nitrogen-limiting conditions. In a root growth assay under conditions of limiting N, 35S::G1792 lines were slightly less stunted. In a germination assay that monitored the effect of C on N signaling through anthocyanin production on high sucrose plus and minus glutamine, the 35S::G1792 lines made less anthocyanin on high sucrose plus glutamine, suggesting that the gene can be involved in the plant's ability to monitor its carbon and nitrogen status.

G1792 overexpressors were more tolerant to drought conditions than wild-type plants in soil-based assays.

G1792 overexpressing plants showed several mild morphological alterations: leaves were dark green and shiny, and plants bolted, and subsequently senesced, slightly later than wild-type controls. Among the T1 plants, additional morphological variation (not reproduced later in the T2 plants) was observed: many showed reductions in size as well as aberrations in leaf shape, phyllotaxy, and flower development.

Potential Applications

G1792 or its equivalogs can be used to engineer pathogen-resistant plants. In addition, it can be used to improve seedling germination and performance under conditions of limited nitrogen.

Potential utilities of this gene or its equivalogs also include increasing chlorophyll content, allowing more growth and productivity in conditions of low light. With a potentially higher photosynthetic rate, fruits could have higher sugar content. Increased carotenoid content could be used as a nutraceutical to produce foods with greater antioxidant capability.

G1792 or its equivalogs could be used to manipulate wax composition, amount, or distribution, which in turn could modify plant tolerance to drought and/or low humidity or resistance to insects, as well as plant appearance (shiny leaves). In particular, it would be interesting to see what effect increased wax deposition on the leaves of a plant like cotton would have on drought resistance or water use efficiency. A possible application for this gene might be in reducing the wax coating on sunflower seeds (the wax fouls the oil extraction system during sunflower seed processing for oil). For this purpose, antisense or co-suppression of the gene in a tissue-specific manner might be useful.

G1816 (SEQ ID NO: 2142 and 2143)

G1816 corresponds to TRIPTYCHON (TRY), a gene that regulates epidermal cell specification in the leaf and root (Schnittger et al., 1998; 1999; Schellmann et al., 2002). The gene was included in our earlier studies as a putative paralog of G682 and based on the increased resistance of 35S::G1816 lines to osmotic stress conditions such as high levels of glucose.

The aim of this study was to re-assess 35S::G1816 lines and determine whether overexpression of the gene could confer enhanced stress tolerance in a comparable manner to G682. We also sought to examine whether use of a two-component overexpression system would produce any strengthening of the phenotype relative to the use of a 35S direct promoter-fusion.

We have now generated 35S::G1816 lines using the two-component system. These lines showed a strong glabrous phenotype, similar to what was observed during our phase I study, and similar to the effect produced by G682 overexpression. However, many of the 35S::G1816 lines were noted to be smaller than controls, an effect that had not been previously recognized.

Ten of the two-component lines were tested in plate based physiology assays, and all ten lines showed a strong resistance to osmotic stress when germinated on sucrose plates. All the lines examined also showed increased density of root hairs. These effects are broadly comparable to those observed in G682 lines, suggesting that the two genes likely have closely related functions. However, 35S::G682 lines showed positive results in a greater number of assays than the 35S::G1816 lines (see 35S::G682 report), perhaps indicating that G682 can protect against a greater range of drought related stresses than G1816.

TABLE 23 G1816 35S, 2-components supTfn

Line  Germ. in  Germ. in  Sucrose   ABA       Germ. in  Germ. in  Growth    Drought   Chilling  G682-like
      high      high                          heat      cold      in heat                       root
      NaCl      mannitol                                                                        morph.
304   wt        wt        ++        wt        wt        wt        wt        wt        wt        +
306   wt        wt        +         wt        wt        wt        wt        wt        wt        +
311   wt        wt        ++        wt        wt        wt        wt        wt        wt        +
313   wt        wt        +         wt        wt        wt        wt        wt        wt        +
325   wt        wt        ++        wt        wt        wt        wt        wt        wt        +
327   wt        wt        +         wt        wt        wt        wt        wt        wt        +
345   wt        wt        ++        wt        wt        wt        wt        wt        wt        +
350   wt        wt        +         wt        wt        wt        wt        wt        wt        +
351   wt        wt        +         wt        wt        wt        wt        wt        wt        +
353   wt        wt        ++        wt        wt        wt        wt        wt        wt        +

Potential Applications

Based on the tolerance of 35S::G1816 lines to osmotic stress, G682 and related sequences such as G1816 are good candidates for use in the alleviation of drought related stress. The strong performance of 35S::G1816 lines on plates containing high levels of sugar particularly suggests that the gene might also be applied to manipulate sugar-sensing responses. The decrease in size seen in some of the lines suggests that optimization of the stress tolerance gene might benefit from the use of different promoters or protein modifications.

The epidermal phenotypes seen in 35S::G1816 lines indicate that the gene could also be used to modify developmental characters such as the formation of trichomes or root hairs.

G1820 (SEQ ID NO: 341 and 342)

Published Information

G1820 is a member of the Hap5 subfamily of CCAAT-box-binding transcription factors. G1820 was identified as part of the BAC clone MBA10, accession number AB025619, released by the Arabidopsis Genome sequencing project.

Experimental Observations

The complete sequence of G1820 was determined.

In soil-based assays, 35S::G1792 direct fusion overexpressing plants were significantly more drought tolerant than wild-type control plants.

The function of this gene was also analyzed using transgenic plants in which G1820 was expressed under the control of the 35S promoter with the two-component transformation system. A wide range of morphological alterations was observed, similar to those seen in our earlier studies.

In plate-based physiology assays on the two-component lines, many of our previously observed drought stress-related phenotypes were confirmed. The majority of lines were tolerant, to varying extents, to salt, mannitol, sucrose, ABA and cold in germination assays.

It should be emphasized that we have observed stress tolerance phenotypes for several other G481 related genes including G482, G485, G1836, and non-Arabidopsis sequences. The similar effects seen when these genes are overexpressed strongly suggest that they are likely to be functionally related, at least with respect to this phenotype.

Overexpression of G1820 also consistently reduced the time to flowering. Under continuous light conditions at 20-25° C, the 35S::G1820 transformants displayed visible flower buds several days earlier than control plants. The primary shoots of these plants typically started flower initiation 1-4 leaf plastochrons sooner than those of wild type. Such effects were observed in all three T2 populations and in a substantial number of primary transformants.

When biochemical assays were performed, some changes in leaf fatty acid methyl esters (FAMEs) were detected. In one line, an increase in the percentage of 18:3 and a decrease in 16:1 were observed. G1820 overexpressors behaved similarly to wild-type controls in other biochemical assays. As determined by RT-PCR, G1820 was highly expressed in embryos and siliques. No expression of G1820 was detected in the other tissues tested. G1820 expression appeared to be induced in rosette leaves by cold and drought stress treatments, and overexpressing lines showed tolerance to water deficit and high salt conditions.

One possible explanation for the complexity of the G1820 overexpression phenotype is that the gene is involved in the cross talk between ABA and GA signal transduction pathways. It is well known that seed dormancy and germination are regulated by the plant hormones abscisic acid (ABA) and gibberellin (GA). These two hormones act antagonistically to each other. ABA induces seed dormancy in maturing embryos and inhibits germination of seeds. GA breaks seed dormancy and promotes germination. It is conceivable that the flowering time and ABA insensitive phenotypes observed in the G1820 overexpressors are related to an enhanced sensitivity to GA, or an increase in the level of GA, and that the phenotype of the overexpressors is unrelated to ABA. In Arabidopsis, GA is thought to be required to promote flowering in non-inductive photoperiods. However, the drought and salt tolerant phenotypes would indicate that ABA signal transduction is also perturbed in these plants. It seems counterintuitive for a plant with salt and drought tolerance to be ABA insensitive, since ABA seems to activate signal transduction pathways involved in tolerance to salt and dehydration stresses. One explanation is that ABA levels in the G1820 overexpressors are also high but that the plant is unable to perceive or transduce the signal.

G1820 overexpressors also had decreased seed oil content and increased seed protein content compared to wild-type plants.

TABLE 24 G1820 35S, 2-components-supTfn

Line  Germ. in  Germ. in  Sucrose   ABA       Germ. in  Germ. in  Growth    Drought   Chilling
      high      high                          heat      cold      in heat
      NaCl      mannitol
305   wt        wt        +         ++        wt        +         wt        wt        +
310   wt        wt        wt        +         wt        wt        wt        wt        wt
321   wt        wt        +         +         wt        +         wt        wt        wt
323   wt        wt        wt        wt        wt        wt        wt        wt        wt
325   wt        wt        wt        +         wt        wt        wt        wt        +
326   wt        wt        +         +         wt        wt        wt        wt        wt
327   wt        wt        wt        wt        wt        wt        wt        wt        wt
341   ++        +         +         ++        wt        +         wt        wt        wt
343   ++        +         +         ++        wt        wt        wt        wt        wt
352   ++        +         +         +         wt        wt        wt        wt        wt

Potential Applications

G1820 affects ABA sensitivity, and thus when transformed into a plant this transcription factor or its equivalogs may diminish cold, drought, oxidative and other stress sensitivities, and also be used to alter plant architecture and yield.

The osmotic stress and cold assay results indicate that G481 and related sequences, including G1820, could be used to alter a plant's response to water deficit and can be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing.

G1820 or its equivalogs may also be used to increase a plant's tolerance to cold.

G1820 or its equivalogs could also be used to accelerate flowering time.

G1820 or its equivalogs may be used to modify levels of saturation in oils.

G1820 or its equivalogs may be used to alter seed protein content.

The promoter of G1820 could be used to drive seed-specific gene expression.

Potential Applications

G1820 or equivalog overexpression may be used to alter seed protein content, which may be very important for the nutritional value and production of various food products.

G1836 (SEQ ID NO: 343 and 344)

Published Information

G1836 was identified in the sequence of BAC F14123, GenBank accession number AC007399, released by the Arabidopsis Genome Initiative.

Experimental Observations

The complete sequence of G1836 was determined. The function of this gene was analyzed using transgenic plants in which G1836 was expressed under the control of the 35S promoter. Morphologically, the plants were somewhat paler than the wild-type controls. This observation did not translate into a detectable difference in the chlorophyll a or chlorophyll b content in these transgenics (see biochemistry data). Overexpression of G1836 affected the plants' ability to tolerate high concentrations of salt in a germination assay. All of the lines showed greater expansion of the cotyledons when seeds were germinated on MS media containing high concentrations of NaCl, indicating they had more tolerance to salt stress compared to the wild-type controls. There was no enhanced tolerance to high salt in older seedlings in a root growth assay. This was not unexpected because salt tolerance in the two developmental stages is often uncoupled in nature, indicating mechanistic differences.

G1836 overexpression also resulted in plants that were more drought tolerant than wild-type control plants.

Expression of G1836 was also repressed by Erysiphe orontii infection.

Seven of ten lines tested in a recent series of plate-based assays showed enhanced abiotic stress tolerance, as indicated in Table 25. These results included six lines with improved salt tolerance, and several of the lines showed altered sugar sensing in sucrose- and mannitol-based assays, less sensitivity to ABA, and improved tolerance to cold in germination and chilling assays.

TABLE 25 G1836 35S, 2 components-supTfn
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
306   +          wt             +        ++   wt        wt        wt       wt       +
362   wt         wt             wt       wt   wt        wt        wt       wt       wt
363   wt         wt             wt       wt   wt        wt        wt       wt       wt
365   wt         wt             wt       wt   wt        wt        wt       wt       wt
367   wt         wt             wt       wt   wt        ++        wt       wt       +
381   +          wt             +        +    wt        wt        wt       wt       +
383   +          wt             wt       wt   wt        wt        wt       wt       +
384   +          wt             +        +    wt        wt        wt       wt       wt
385   +          +              +        wt   wt        wt        wt       wt       wt
386   +          wt             +        +    wt        wt        wt       wt       wt

Potential Applications

The results of these abiotic stress assays indicate that G1836 can be used to increase plant tolerance to drought, soil salinity and cold conditions during germination or at the seedling stage. The results of these studies confirm our earlier conclusions that G481 and its related genes, including G1836, are excellent candidates for improvement of drought and cold-related stress tolerance in plants.
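The per-line tallies quoted for Table 25 (seven of ten lines positive in at least one assay, six of them on high NaCl) can be checked with a short script. The following is a minimal sketch only: it assumes that "+" and "++" denote performance better than wild type and that "wt" denotes wild-type performance, and the names COLUMNS, TABLE_25 and positive_assays are illustrative and not part of the original disclosure.

```python
# Scores transcribed from Table 25 (G1836, 35S, two-component lines).
# Assumption: "+" and "++" mean better than wild type; "wt" means wild-type.
COLUMNS = [
    "Germ. in high NaCl", "Germ. in high mannitol", "Sucrose", "ABA",
    "Germ. in heat", "Germ. in cold", "Growth in heat", "Drought", "Chilling",
]

TABLE_25 = {
    306: "+ wt + ++ wt wt wt wt +",
    362: "wt wt wt wt wt wt wt wt wt",
    363: "wt wt wt wt wt wt wt wt wt",
    365: "wt wt wt wt wt wt wt wt wt",
    367: "wt wt wt wt wt ++ wt wt +",
    381: "+ wt + + wt wt wt wt +",
    383: "+ wt wt wt wt wt wt wt +",
    384: "+ wt + + wt wt wt wt wt",
    385: "+ + + wt wt wt wt wt wt",
    386: "+ wt + + wt wt wt wt wt",
}


def positive_assays(row):
    """Return the assay names in which a line scored '+' or '++'."""
    scores = row.split()
    if len(scores) != len(COLUMNS):
        raise ValueError("row does not have one score per assay column")
    return [name for name, score in zip(COLUMNS, scores) if score in ("+", "++")]


# Lines positive in at least one assay, and lines positive on high NaCl.
lines_with_any = [line for line, row in TABLE_25.items() if positive_assays(row)]
salt_lines = [
    line for line, row in TABLE_25.items()
    if "Germ. in high NaCl" in positive_assays(row)
]

print(len(lines_with_any), "of", len(TABLE_25), "lines positive in at least one assay")
print(len(salt_lines), "lines positive for germination on high NaCl")
```

The same tally could be applied to any of the other score tables whose rows carry one score per assay column.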

G1930 (SEQ ID NO: 369 and 370)

Published Information

G1930 was identified in the sequence of P1 clone K13N2 (gene K13N2.7, GenBank protein accession number BAA95760).

Experimental Observations

G1930 was ubiquitously expressed and did not appear to be induced by any of the conditions tested.

We generated 35S::G1930 lines via both the direct promoter-fusion and the 2-component methods. Both types of lines exhibited a variety of morphological phenotypes including reduced size, slow growth, and alterations in leaf orientation. In some lines, changes in leaf shape, hypocotyl length, trichome density, flowering time and non-specific floral abnormalities that reduced fertility were also observed.

We tested both two-component and direct promoter-fusion lines under a variety of plate-based treatments. Comparable results were obtained with both types of lines, but the 2-component lines generally showed stronger phenotypes, suggesting that this system may have afforded an amplification of G1930 activity relative to the direct promoter-fusion. All twenty of the lines tested gave positive results in one or more of the stress treatments, including sucrose, NaCl, ABA, cold germination and cold growth. Particularly strong resistance was seen in the NaCl and sucrose germination assays.

An increase in the amount of chlorophylls a and b in seeds of two T2 lines was detected.

We have obtained comparable developmental effects as well as a strong enhancement of drought-related stress tolerance from all four of the Arabidopsis genes in the G867 study group (G9, G867, G993, and G1930). The almost identical phenotypic effects produced by these genes strongly suggest that they are functionally equivalent.

TABLE 26 G1930 35S, Direct promoter-fusion and 2-components-supTfn
Line, Transformation, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
304  Direct fusion  +  wt  wt  wt  wt  wt  wt  wt  wt
305  Direct fusion  wt  wt  ++  wt  wt  +  wt  wt  wt
306  Direct fusion  +  wt  wt  wt  wt  wt  wt  wt  +
308  Direct fusion  wt  wt  ++  wt  wt  wt  wt  wt  +
309  Direct fusion  wt  wt  wt  wt  wt  wt  wt  wt  +
311  Direct fusion  wt  wt  +  wt  wt  +  wt  wt  +
365  Direct fusion  +  wt  ++  wt  wt  wt  wt  wt  wt
367  Direct fusion  +  wt  ++  wt  wt  wt  wt  wt  +
369  Direct fusion  +  wt  wt  wt  wt  wt  wt  wt  +
370  Direct fusion  +  wt  wt  wt  wt  wt  wt  wt  +
321  2-comp  +  wt  ++  wt  wt  wt  wt  wt  +
322  2-comp  +  wt  +  +  wt  wt  wt  wt  +
324  2-comp  +  wt  +  +  wt  wt  wt  wt  +
327  2-comp  +  wt  +  wt  wt  wt  wt  wt  wt
329  2-comp  +  wt  +  wt  wt  wt  wt  wt  +
331  2-comp  +  wt  ++  +  wt  wt  wt  +
332  2-comp  +  wt  +  +  wt  wt  wt  wt  wt
334  2-comp  +  wt  wt  wt  wt  wt  wt  wt  wt
336  2-comp  +  wt  ++  +  wt  wt  wt  wt  +
339  2-comp  +  wt  wt  wt  wt  wt  wt  wt  wt
331  2-comp  +

Potential Applications

Based on the results of our overexpression studies, G867 and related sequences such as G1930 are excellent candidate genes for improvement of abiotic stress tolerance in commercial plant species. The morphological effects associated with their overexpression indicate that tissue-specific or conditional promoters might be used to optimize the utility of these genes.

This gene could also be used to regulate the levels of chlorophyll in seeds.

G2053 (SEQ ID NO: 389 and 390)

Published Information

G2053 was identified in the sequence of BAC T27C4, GenBank accession number AC022287, released by the Arabidopsis Genome Initiative.

Experimental Observations

The function of G2053 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. In a root growth assay on media containing high concentrations of PEG, G2053 overexpressors showed more root growth compared to wild-type controls. G2053 overexpressors also were significantly more drought tolerant than wild-type control plants.

Potential Applications

G2053 or its equivalogs could be used to alter a plant's response to water deficit conditions and, therefore, could be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing.

G2133 (SEQ ID NO: 407 and 408)

G2133 corresponds to gene F26A9.11 (AAF23336).

Experimental Observations

The function of G2133 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter.

G2133 expression was detected in a variety of tissues: flower, leaf, embryo, and silique samples. Its expression might be altered by several conditions, including auxin treatment, osmotic stress, and Fusarium infection. Overexpression of G2133 caused a variety of alterations in plant growth and development: delayed flowering, altered inflorescence architecture, and a decrease in overall size and fertility. G2133 plants were more tolerant to glyphosate than wild-type control plants.

At early stages, 35S::G2133 transformants were markedly smaller than controls and displayed curled, dark-green leaves. Most of these plants remained in a vegetative phase of development substantially longer than controls, and produced an increased number of leaves before bolting. In the most severely affected plants, bolting occurred more than a month later than in wild type (24-hour light). In addition, the plants displayed a reduction in apical dominance and formed large numbers of shoots simultaneously, from the axils of rosette leaves. These inflorescence stems had short internodes, and carried increased numbers of cauline leaf nodes, giving them a very leafy appearance. The fertility of 35S::G2133 plants was generally very low. In addition, G2133 overexpressing lines were found to be more resistant to the herbicide glyphosate in initial and repeat experiments.

G2133 is a paralog of G47, the latter having been known from earlier studies to confer a drought tolerance phenotype when overexpressed. It was thus not surprising when G2133 was also shown to induce drought tolerance in a number of 35S::G2133 lines challenged in soil-based drought assays. After re-watering, all of the plants of both G2133 overexpressor lines became reinvigorated, and all of the control plants died or were severely affected by the drought treatment.

Potential Applications

G2133 and its equivalogs can be used to increase the tolerance of plants to drought and to other osmotic stresses. G2133 could also be used for the generation of glyphosate-resistant plants, and to increase plant resistance to oxidative stress.

G2153 (SEQ ID NO: 417 and 418)

Published Information

The sequence of G2153 was obtained from the Arabidopsis genomic sequencing project, GenBank accession number AC011437, based on its sequence similarity within the conserved domain to other AT-hook related proteins in Arabidopsis. G2153 corresponds to gene F7018.4 (AAF04888).

Experimental Observations

The complete sequence of G2153 was determined. G2153 was strongly expressed in roots, embryos, siliques, and germinating seed, but at low or undetectable levels in shoots, flowers, and rosette leaves. It was not significantly induced or repressed by any condition tested.

The function of this gene was analyzed using transgenic plants in which G2153 was expressed under the control of the 35S promoter. Overexpression of G2153 in Arabidopsis resulted in seedlings with an altered response to osmotic stress. In a germination assay on media containing high sucrose, G2153 overexpressors had more expanded cotyledons and longer roots than the wild-type controls. This phenotype was confirmed in repeat experiments on individual lines, and all three lines showed osmotic tolerance. Increased tolerance to high sucrose could also be indicative of effects on sugar sensing. Overexpression of G2153 produced no consistent effects on Arabidopsis morphology, and no altered phenotypes were noted in any of the biochemical assays.

Potential Applications

G2153 or its equivalogs can be used to improve a plant's response to drought, salt stress, and freezing.

G2153 or its equivalogs may also be useful for altering a plant's response to sugars.

G2155 (SEQ ID NO: 419 and 420)

Published Information

The sequence of G2155 was obtained from the Arabidopsis genomic sequencing project, GenBank accession number AC012188.

Experimental Observations

The complete sequence of G2155 was determined. G2155 expression was detected at low levels only in flowers and embryos. It was not induced in rosette leaves by any condition tested.

The function of this gene was analyzed using transgenic plants in which G2155 was expressed under the control of the 35S promoter. Overexpression of G2155 produced a marked delay in the time to flowering. Under continuous light conditions, 35S::G2155 transformants displayed a considerable extension of vegetative development, and typically formed flower buds about two weeks later than wild-type controls. At early stages, the plants were slightly small and had rather rounded leaves compared to wild type. However, later in development, when the leaves were fully expanded, 35S::G2155 plants became very large, dark-green, and senesced much later than controls.

In addition, overexpression of G2155 resulted in an increase in seed glucosinolate M39497 in two T2 lines. No other phenotypic alterations were observed in any of the biochemical or physiological assays.

Potential Applications

G2155 or equivalog overexpression may be used to delay flowering.

G2155 or its equivalogs could also be used to alter seed glucosinolate composition.

G2345 (SEQ ID NO: 2171 and 2172)

G2345 is a putative paralog of G481, and is a member of the HAP3 subgroup of the CCAAT transcription factor family. During our earlier program, we found that G2345-overexpressing plants were similar to wild-type plants in all assays performed. The aim of this study was to reassess the role of G2345 in drought stress-related tolerance via two-component overexpression.

We have now generated 35S lines for G2345 using the two-component system; three batches of T1 lines were obtained, including lines 301-302, 341-347, and 381-400. Some size variation was apparent in the second batch of plants (341-347), but otherwise, no consistent differences in morphology were observed compared to wild-type controls.

Two of two lines of overexpressors tested showed better germination in cold conditions than did wild-type control plants.

TABLE 27 G2345 35S, 2-components supTfn
Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
301  wt  wt  wt  wt  wt  wt  wt
302  wt  wt  wt  wt  wt  wt  wt
341  wt  wt  wt  wt  wt  wt  wt
386  wt  wt  wt  wt  wt  wt  wt
387  wt  wt  wt  wt  wt  wt  wt
389  wt  wt  wt  wt  wt  wt  wt
390  wt  wt  wt  wt  wt  wt  wt
392  wt  wt  wt  wt  wt  wt  wt
393  wt  wt  wt  wt  wt  +  wt  wt
400  wt  wt  wt  wt  wt  +  wt  wt

Potential Applications

The results of the cold germination assay confirm that G481 and its related sequences, including G2345, are excellent candidates for improving abiotic stress tolerance in plants.

G2509 (SEQ ID NO: 439 and 440)

Published Information

G2509 corresponds to gene T2I1_20 (CAB87920).

Experimental Observations

G2509 (SEQ ID NO: 439) was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G2509 caused multiple alterations in plant growth and development, most notably altered branching patterns and a reduction in apical dominance, giving the plants a shorter, bushier stature than wild type. Twenty 35S::G2509 primary transformants were examined; at early stages of rosette development, these plants displayed a wild-type phenotype. However, at the switch to flowering, almost all T1 lines showed a marked loss of apical dominance and large numbers of secondary shoots developed from axils of primary rosette leaves. In the most extreme cases, the shoots had very short internodes, giving the inflorescence a very bushy appearance. Such shoots were often very thin and flowers were relatively small and poorly fertile. At later stages, many plants appeared very small and had a low seed yield compared to wild type. In addition to the effects on branching, a substantial number of 35S::G2509 primary transformants also flowered early and had buds visible several days prior to wild type. Similar effects on inflorescence development were noted in each of three T2 populations examined. The branching and plant architecture phenotypes observed in 35S::G2509 lines resemble phenotypes observed for three other AP2/EREBP genes: G865, G1411, and G1794. G2509, G865, and G1411 form a small clade within the large AP2/EREBP family, and G1794, although not belonging to the clade, is one of the AP2/EREBP genes closest to it in the phylogenetic tree. It is thus likely that all these genes share a related function, such as affecting hormone balance.

G2509 overexpressing plants had increased seed protein compared to wild-type control plants.

Overexpression of G2509 in Arabidopsis resulted in an increase in alpha-tocopherol in seeds in two T2 lines. G2509 was ubiquitously expressed in Arabidopsis plant tissue. G2509 expression levels were altered by a variety of environmental or physiological conditions.

Potential Applications

G2509 or its equivalogs can be used to manipulate plant architecture and development, and to alter tocopherol composition and flowering time.

G2583 (SEQ ID NO: 449 and 450)

Published Information

G2583 corresponds to gene F2I11_80 (CAB96654).

Experimental Observations

35S::G2583 plants exhibited extremely glossy leaves. At early stages, 35S::G2583 seedlings appeared normal, but by about two weeks after sowing, the plants exhibited very striking shiny leaves, which were apparent until very late in development. Many lines displayed a variety of other effects such as a reduction in overall size, narrow curled leaves, or various non-specific floral abnormalities, which reduced fertility. These effects on leaf appearance were observed in 18 of 20 primary transformants, and in all the plants from 4 of 6 T2 lines. The glossy nature of the leaves may be a consequence of changes in epicuticular wax content or composition. G2583 belongs to a small clade within the large AP2/EREBP Arabidopsis family that also contains G975, G1387, and G977. G975 overexpression causes a substantial increase in leaf wax and a morphology resembling that of 35S::G2583 plants. G2583 was ubiquitously expressed, at higher levels in root, flower, embryo, and siliques.

Potential Applications

G2583 or its equivalogs can be used to modify plant appearance by producing shiny leaves. In addition, it or its equivalogs can be used to manipulate wax composition, amount, or distribution, which in turn can modify plant tolerance to drought and/or low humidity or resistance to insects.

G2718 (SEQ ID NO: 2191 and 2192)

G2718 was included in our earlier studies as a putative paralog of G682. A genetic analysis of G2718 function has not yet been published. The aim of this study was to re-assess 35S::G2718 lines and determine whether overexpression of the gene could confer enhanced stress tolerance in a comparable manner to G682. We also sought to examine whether use of a two-component overexpression system would produce any strengthening of the phenotype relative to the use of a 35S direct promoter-fusion.

We have now generated 35S::G2718 lines using the two-component system. These lines showed a strong glabrous phenotype, similar to what was observed during our earlier studies, and similar to the effect produced by G682 overexpression. Almost all of the lines tested were more tolerant to high sucrose in an osmotic stress assay, and two lines were found to be insensitive to ABA.

TABLE 28 G1816 35S, 2-components supTfn
Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling, G682-like root morph.
341  wt  wt  wt  +  wt  wt  wt  wt
342  wt  wt  +  +  wt  wt  wt  +
425  wt  wt  +  wt  wt  wt  wt  +
427  wt  wt  +  wt  wt  wt  wt  +
428  wt  wt  wt  wt  wt  wt  wt  +
429  wt  wt  +  wt  wt  wt  wt  +
431  wt  wt  +  wt  wt  wt  wt  +
432  wt  wt  +  wt  wt  wt  wt  ++
433  wt  wt  +  wt  wt  wt  wt  wt
439  wt  wt  +  wt  wt  wt  wt  +

Potential Applications

Based on the tolerance of 35S::G1816 lines to osmotic stress, G682 and related sequences such as G1816 are good candidates for use in the alleviation of drought-related stress. The strong performance of 35S::G1816 lines on plates containing high levels of sugar particularly suggests that the gene might also be applied to manipulate sugar-sensing responses. However, the decrease in size seen in some of the lines suggests that the gene might require optimization by use of different promoters or protein modifications prior to product development.

The epidermal phenotypes seen in 35S::G1816 lines indicate that the gene could also be used to modify developmental characters such as the formation of trichomes or root hairs.

Rice G3377 (SEQ ID NO: 1221 and 2934)

G3377 is a rice gene that was identified as being a putative ortholog of G912. The aim of this project was to determine whether G3377 has an equivalent function to G912 via the analysis of 35S::G3377 Arabidopsis lines.

35S::G3377 overexpressors were obtained at relatively low frequency. The lines that were recovered showed a number of striking morphological effects including a reduction in size, dark coloration and delayed flowering. Such features were somewhat comparable to those shown by 35S::G912 lines, indicating that the two genes likely have a similar function.

Lines of plants overexpressing these rice sequences were shown to have increased germination in high salt, mannitol, sucrose and hot conditions, as seen in the following table.

TABLE 29 G3377 35S, Direct promoter-fusion
Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
382  +  +  +  wt  wt
383  wt  +  +  wt  wt
384  wt  wt  wt  wt  +
385  wt  wt  wt  wt  wt
386  wt  wt  +  wt  wt
388  wt  wt  wt  wt  wt
389  wt  wt  wt  wt  +
390  wt  wt  wt  wt  wt
391  wt  wt  wt  wt  wt
392  wt  wt  wt  wt  wt

Potential Applications

Given the comparable morphological effects of G3377 and G912 overexpression, it is probable that the genes have comparable roles.

The delayed flowering exhibited by 35S::G3377 lines suggests that the gene might also be applied to modify flowering time; in particular, an extension of vegetative growth can significantly increase biomass and result in substantial yield increases. In some species (for example sugar beet), where the vegetative parts of the plant constitute the crop, it would be advantageous to delay or suppress flowering in order to prevent resources being diverted into reproductive development. Additionally, delaying flowering beyond the normal time of harvest could alleviate the risk of transgenic pollen escape from such crops.

The results of the abiotic stress assays confirm that G912 and its related sequences, including the rice G3377 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Rice G3392 (SEQ ID NO: 1082 and 2920)

G3392 is a rice gene that was identified as being a putative ortholog of G682. The aim of this project was to determine whether G3392 has an equivalent function to the G682-related genes from Arabidopsis via the analysis of 35S::G3392 Arabidopsis lines.

We have now generated 35S::G3392 lines; these plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a reduction in overall size. Such similarities in phenotypes suggest that the genes have similar functions. Interestingly, many of the 35S::G3392 lines also produced pale yellow seed, which likely indicated a reduction in anthocyanin levels in the seed coat. Such an effect was not observed in 35S::G682 seed, but G682 and its paralogs were found during our genomics studies to inhibit anthocyanin production.

TABLE 30 G3392 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Heat    Drought  Chilling  G682-like
      high NaCl  high mannitol                heat      cold      growth                     root morph.
301   wt         wt             wt       wt   wt        +         wt      +        +         +
305   wt         wt             +        wt   wt        +         wt      +        +         +
306   +          wt             +        wt   wt        +         −       +        +         +
321   wt         wt             +        wt   wt        +         wt      +        +         +
322   +          wt             wt       wt   wt        +         −       ND       +         +
341   wt         +              +        wt   wt        +         wt      +        +         +
342   +          +              ++       wt   wt        +         wt      +        +         +
346   wt         +              +        wt   wt        +         wt      +        +         +
347   wt         wt             wt       wt   wt        +         wt      +        +         +
348   wt         wt             +        wt   wt        +         wt      +        +         +

Potential Applications

The results of these abiotic stress assays confirm that G682 and its related sequences, including the rice G3392 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3392 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation. The lighter coloration of 35S::G3392 plants could indicate that G3392 might be used to regulate the production of flavonoid-related compounds, which contribute to the nutritional value of foodstuffs.

Rice G3393 (SEQ ID NO: 559 and 2921)

G3393 is a putative rice ortholog of G682. 35S::G3393 plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a reduction in overall size. These similarities in phenotypes suggest that the genes have similar functions. Many of the 35S::G3393 lines also produced pale yellow seed, which likely indicated a reduction in anthocyanin levels in the seed coat. Such an effect was not observed in 35S::G682 seed, but we did find that G682 and its paralogs inhibited anthocyanin production.

TABLE 31 G3393 35S, Direct promoter-fusion
Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Heat growth, Drought, Chilling, G682-like root morph.
305  wt  wt  +  wt  wt  wt  wt  ++  +
307  wt  wt  wt  wt  wt  wt  wt  +  +
308  wt  wt  wt  wt  wt  wt  wt  ++  +
323  wt  wt  wt  wt  wt  wt  wt  wt  +  +
324  wt  wt  wt  wt  wt  wt  wt  wt  +  +
326  wt  wt  wt  wt  wt  wt  wt  wt  +  +
327  wt  wt  wt  wt  wt  wt  wt  wt  ++  +
328  wt  wt  wt  wt  wt  wt  wt  wt  ++  +
331  wt  wt  wt  wt  wt  wt  wt  wt  wt  +
333  wt  wt  wt  wt  wt  wt  wt  wt  +  +

Potential Applications

The results of the cold germination and sucrose assays confirm that G682 and its related sequences, including the rice G3393 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3393 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation. The lighter coloration of 35S::G3393 plants could indicate that G3393 might be used to regulate the production of flavonoid-related compounds, which contribute to the nutritional value of foodstuffs.

Rice G3395 (SEQ ID NO: 790 and 2910)

G3395 is an ortholog of G481 from Oryza sativa. G3395 is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. It corresponds to OsHAP3A and has recently been shown to influence chloroplast biogenesis (Miyoshi et al. (2003) Plant J. 36: 532-540). We have previously indicated that overexpression of G3395 conferred a salt tolerant phenotype in Arabidopsis plants (U.S. patent application Ser. No. 10/675,852, filed Sep. 30, 2003).

Experimental Observations

The aim of the present study was to assess the role of G3395 in drought stress-related tolerance via overexpression, and compare the effects with those of the other G481 orthologs and paralogs.

The majority of 35S::G3395 plants showed a very slight acceleration in flowering time, but a single line showed slightly late flowering. One G3395-overexpressing line was shown to have improved tolerance to drought in a plate-based assay. The same line also showed increased tolerance to high salt concentration relative to wild-type or non-transformed controls, confirming the prior observed result.

TABLE 32 G3395 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
301   wt         wt             wt       wt   wt        wt        wt       wt       wt
302   +          wt             wt       wt   wt        wt        wt       +        wt
303   wt         wt             wt       wt   wt        wt        wt       wt       wt
304   wt         wt             wt       wt   wt        wt        wt       wt       wt
305   wt         wt             wt       wt   wt        wt        wt       wt       wt
306   wt         wt             wt       wt   wt        wt        wt       wt       wt
307   wt         wt             wt       wt   wt        wt        wt       wt       wt
308   wt         wt             wt       wt   wt        wt        wt       wt       wt
309   wt         wt             wt       wt   wt        wt        wt       wt       wt
310   wt         wt             wt       wt   wt        wt        wt       wt       wt

Potential Applications

The results of this drought assay confirm that G481 and its related sequences, including the rice sequence G3395, are excellent candidates for improving abiotic stress tolerance in plants.

Rice G3397 (SEQ ID NO: 794 and 2911)

G3397, an ortholog of G481 from Oryza sativa, is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. This gene is phylogenetically most closely related to G485, another member of the HAP3 subgroup. G3397 corresponds to OsHAP3C and has recently been shown to influence chloroplast biogenesis (Miyoshi et al. (2003) supra).

Experimental Observations

The aim of this study was to assess the role of G3397 in drought stress-related tolerance via overexpression, and compare the effects with those of the other G481 orthologs and paralogs.

35S::G3397 plants showed a 1-2 week acceleration in flowering time, compared to wild-type or non-transformed plants. This same phenotype was also noted for the most closely related Arabidopsis gene, G485. These early flowering lines also were smaller than controls. Ten lines were tested in plate-based physiology assays. One of these lines showed insensitivity to ABA, and another germinated better than wild-type or non-transformed plants in hot conditions.

TABLE 33 G3397 35S, Direct promoter-fusion
Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
322  wt  wt  wt  wt  wt
323  wt  wt  wt  wt  wt
324  wt  wt  wt  wt  wt
326  wt  wt  wt  +  wt
328  wt  wt  wt  wt  wt
330  wt  wt  wt  wt  wt
334  wt  wt  wt  wt  wt
335  wt  wt  wt  wt  wt
338  wt  wt  wt  wt  wt
339  wt  wt  wt  wt  +

Potential Applications

The results of the heat and ABA germination assays confirm that G481 and its related sequences, including the rice sequence G3397, are excellent candidates for improving abiotic stress tolerance in plants.

Rice G3398 (SEQ ID NO: 796 and 2912)

G3398 from Oryza sativa is related to G481, and phylogenetically most closely related to G485. Like G481 and G485, G3398 is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family.

Experimental Observations

The aim of this study was to assess the role of G3398 in drought stress-related tolerance via overexpression, and compare the effects with those of the other G481 orthologs and paralogs.

35S::G3398 plants showed a 1-2 week acceleration in flowering time, compared to wild-type or non-transformed plants. This same phenotype was also noted for the most closely related Arabidopsis gene, G485. These early flowering lines also were smaller than controls.

In the most recent study, six lines overexpressing G3398 have thus far been tested in physiological assays. One line was shown to germinate better in heat than wild-type or non-transformed controls.

TABLE 34 G3398 35S, Direct promoter-fusion
Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
303  wt  wt  wt  wt  +
323  wt  wt  wt  wt  wt
324  wt  wt  wt  wt  wt
329  wt  wt  wt  wt  wt
332  wt  wt  wt  wt  wt
335  wt  wt  wt  wt  wt

Potential Applications

The results of the heat germination assay confirm that G481 and its related sequences, including the rice G3398 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Rice G3399 (SEQ ID NO: 1399 and 2937)

G3399 was identified as a putative rice ortholog of G1073. Phylogenetic analysis identifies G3399 along with G3400 as being the most closely related orthologs to G1073. The aim of this project is to determine whether overexpression of G3399 in Arabidopsis produces comparable effects to those of G1073 overexpression.

35S::G3399 lines have been obtained containing either of two different constructs. Both constructs produced similar morphological phenotypes; many of the lines were small at early stages, showed alterations in leaf shape, and had slightly delayed flowering. However, a significant number of lines developed enlarged lateral organs (leaves and flowers), particularly at later stages.

It is noteworthy that one of the constructs (P21269) contained an amino acid substitution (proline to glutamine at residue 198, in a conserved domain) relative to the native protein. Lines for this mutated protein showed fewer undesirable morphologies than the wild-type version.

The morphologically similar effects caused by overexpression of this rice gene versus G1073 and the Arabidopsis paralogs suggest that they likely have related functions.

TABLE 35 G3399 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
321   wt         wt             wt       wt   wt        wt        wt       wt       wt
322   wt         wt             wt       wt   wt        wt        wt       wt       wt
323   wt         wt             wt       wt   wt        wt        wt       wt       wt
325   wt         wt             wt       wt   ++        wt        wt       wt       wt
330   wt         wt             wt       wt   wt        wt        wt       wt       wt
331   wt         wt             wt       wt   wt        wt        wt       wt       wt
332   wt         wt             wt       wt   wt        wt        wt       wt       wt
336   wt         wt             wt       wt   wt        wt        wt       wt       wt
338   wt         wt             wt       wt   +         wt        wt       wt       wt
340   wt         wt             wt       wt   wt        wt        wt       wt       wt
347   wt         wt             wt       wt   wt        wt        wt       wt       +
348   wt         wt             wt       wt   wt        wt        wt       wt       +

Potential Applications

The morphological phenotype induced by G3399 indicates that the gene could be used to modify traits such as organ size and flowering time. This study also identified a specific region of the G3399 protein that might be modified in order to optimize the acquisition of desirable phenotypes. In cases where the increased size conferred by G3399 overexpression would be undesirable, the morphological changes caused by G3440 overexpression may be optimized by the use of, for example, inducible or tissue-specific promoters.

The results of the heat germination and chilling assays confirm that G1073 and its related sequences, including the rice G3399 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Rice G3429 (SEQ ID NO: 2945 and 2946)

G3429 from Oryza sativa is a gene related to G481. From our phylogenetic analysis, G3429 appears to be more distantly related to G481 than the other non-Arabidopsis genes in this study (FIG. 3). The gene encodes a protein corresponding to OsNF-YB1 and has been shown to form a ternary complex with a MADS protein, OsMADS18 (Masiero et al. (2002) J. Biol. Chem. 277: 26429-26435).

Experimental Observations

The aim of this project was to assess the role of G3429 in drought stress-related tolerance, and compare the effects with those of the G481 related genes.

Out of twenty 35S::G3429 T1 plants examined, six were notably late flowering and had narrow leaves compared to wild-type or non-transformed plants.

Six of ten G3429-overexpressing lines were more tolerant to high salt conditions than wild-type or non-transformed plants.

TABLE 36 G3429 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
301   wt         wt             wt       wt   wt        wt        wt       wt       wt
302   +          wt             wt       wt   −         wt        wt       wt       wt
304   +          wt             wt       wt   wt        wt        wt       wt       wt
305   +          wt             wt       wt   wt        wt        wt       wt       wt
308   wt         wt             wt       wt   wt        wt        wt       wt       wt
310   wt         wt             wt       wt   wt        wt        wt       wt       wt
311   wt         wt             wt       wt   wt        wt        wt       wt       wt
312   +          wt             wt       wt   wt        wt        wt       wt       wt
313   +          wt             wt       wt   wt        wt        wt       wt       wt
319   +          wt             wt       wt   wt        wt        wt       wt       wt

Potential Applications

The results of the heat germination and chilling assays confirm that G481 and its related sequences, including the rice G3429 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Corn G3431 (SEQ ID NO: 1089, 556 and 2922)

G3431 is a maize gene that was identified as being a putative ortholog of G682. The aim of this project was to determine whether G3431 has an equivalent function to the G682-related genes from Arabidopsis via the analysis of 35S::G3431 Arabidopsis lines.

We have now generated 35S::G3431 lines; these plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a reduction in overall size. These similarities in phenotypes suggest that the genes have similar functions. Interestingly, some of the 35S::G3431 lines also produced pale yellow seed, which likely indicated a reduction in anthocyanin levels in the seed coat. Such an effect was not observed in 35S::G682 seed, but G682 and its paralogs were found during earlier genomics studies to inhibit anthocyanin production.

TABLE 37 G3431 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Heat     Drought  Chilling  G682-like
      high NaCl  high mannitol                heat      cold      growth                      root morph.
303   wt         wt             wt       wt   +         wt        wt       wt       ++        +
306   wt         wt             wt       wt   +         wt        wt       wt       wt        −
321   wt         wt             +        wt   wt        wt        wt       wt       +         +
322   wt         wt             wt       wt   wt        wt        wt       wt       +         +
325   wt         wt             +        wt   wt        wt        wt       wt       +         +
327   wt         wt             +        wt   wt        wt        wt       wt       wt        +
328   wt         wt             +        wt   wt        wt        wt       wt       +         +
331   wt         wt             wt       wt   wt        wt        wt       wt       wt        wt
333   wt         wt             wt       wt   wt        wt        wt       wt       wt        wt
340   wt         wt             wt       wt   wt        wt        wt       wt       wt        +

Potential Applications

The results of these abiotic stress assays confirm that G682 and its related sequences, including the corn G3431 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3431 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation. The lighter coloration of 35S::G3431 plants could indicate that G3431 might be used to regulate the production of flavonoid-related compounds, which contribute to the nutritional value of foodstuffs.

Corn G3434 (SEQ ID NO: 806 and 2913)

G3434, from Zea mays, is an ortholog of G481 and G482, and is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. Among the Arabidopsis paralogs in the G481/G482 study group, G3434 is phylogenetically most closely related to G481.

Experimental Observations

The aim of this study was to assess the role of G3434 in drought stress-related tolerance via overexpression, and compare the effect with that of the other G481/G482 orthologs and paralogs. As seen in Table 38, 35S::G3434 lines showed enhanced salt tolerance and improved sugar sensing as compared to wild-type or non-transformed plants.

TABLE 38 G3434 35S, Direct promoter-fusion
Columns: Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
421  wt  wt  wt  wt  wt
422  +   wt  wt  wt  wt
423  wt  wt  wt  wt  wt
424  wt  wt  wt  wt  wt
426  wt  wt  wt  wt  wt
429  +   +   +   wt  wt
432  wt  wt  wt  wt  wt
434  +   +   +   wt  wt
435  wt  wt  wt  wt  wt
436  wt  wt  wt  wt  wt

Potential Applications

The results of these salt, mannitol and sucrose stress assays confirm that G481 and its related sequences, including the corn G3434 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Corn G3436 (SEQ ID NO: 805 and 2914)

G3436 from Zea mays is a putative ortholog of G481, and is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. Among the Arabidopsis paralogs in the G481 study group, this gene is phylogenetically most closely related to G485.

Experimental Observations

The aim of this study was to assess the role of G3436 in drought stress-related tolerance via overexpression, and compare the effects with those of the other G481 orthologs and paralogs.

Twenty lines of 35S::G3436 plants flowered about 1 week earlier than wild-type or non-transformed plants. This same phenotype was also noted for the most closely related Arabidopsis gene, G485. Many of these early flowering lines also were smaller than controls.

As seen in Table 39, G3436 overexpressors performed better than controls in a significant number of assay conditions, including high salt, heat germination, cold germination, and growth in cold.

TABLE 39 G3436 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
301   wt         wt             wt       wt   +         +         wt       wt       +
302   wt         wt             wt       wt   wt        wt        wt       wt       wt
304   wt         wt             wt       wt   +         wt        wt       wt       +
305   wt         wt             wt       wt   +         wt        wt       wt       wt
308   wt         wt             wt       wt   wt        wt        wt       wt       wt
309   wt         wt             wt       wt   +         +         wt       wt       wt
312   wt         wt             wt       wt   +         wt        wt       wt       wt
313   wt         wt             wt       wt   wt        wt        wt       wt       wt
314   +          wt             wt       wt   wt        wt        wt       wt       wt
315   wt         wt             wt       wt   +         wt        wt       wt       wt

Potential Applications

The results of the salt, heat, cold germination and chilling assays confirm that G481 and its related sequences, including the corn G3436 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Corn G3440 (SEQ ID NO: 1246, 1213 and 2936)

G3440 is a maize gene that was identified as being a putative ortholog of G912. The aim of this project was to determine whether G3440 has an equivalent function to G912 via the analysis of 35S::G3440 Arabidopsis lines.

35S::G3440 lines were small, dark in coloration, and flowered later than control lines. These effects were very similar to those produced by overexpression of G912 in Arabidopsis.

TABLE 40 G3440 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
302   wt         wt             wt       wt   wt        wt        wt       wt       wt
306   wt         wt             wt       wt   wt        wt        wt       wt       wt
308   wt         wt             wt       wt   wt        wt        wt       wt       +
309   wt         wt             wt       wt   wt        wt        wt       wt       wt
310   wt         wt             wt       wt   wt        wt        wt       wt       wt
311   wt         wt             wt       wt   wt        wt        wt       wt       wt
312   wt         wt             wt       wt   wt        wt        wt       wt       wt
314   wt         wt             wt       wt   wt        wt        wt       wt       wt
317   wt         wt             wt       wt   wt        wt        wt       wt       wt
320   wt         wt             wt       wt   wt        wt        wt       wt       wt

Potential Applications

Given the comparable morphological effects of G3440 and G912 overexpression, it is probable that the genes have comparable roles. The developmental changes caused by G3440 overexpression suggest that the gene would benefit from optimization with, for example, the use of inducible or tissue-specific promoters.

The results of the chilling assay confirm that G912 and its related sequences, including the corn G3440 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Soy G3445 (SEQ ID NO: 1083 and 2915)

G3445 is a soy gene that was identified as a putative ortholog of G682.

We have now generated 35S::G3445 lines; these plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a slight reduction in overall size. These similarities in phenotypes suggest that the genes have similar functions.

TABLE 41 G3445 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Heat     Drought  Chilling  G682-like
      high NaCl  high mannitol                heat      cold      growth                      root morph.
301   wt         wt             wt       +    wt        wt        wt       wt       wt        wt
302   wt         wt             wt       +    wt        wt        wt       wt       wt        +
303   wt         wt             wt       +    wt        wt        wt       wt       wt        wt
321   wt         wt             wt       wt   wt        wt        wt       wt       wt        −
323   wt         wt             wt       +    wt        wt        wt       wt       wt        wt
341   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
342   wt         wt             wt       wt   wt        wt        wt       +        wt        wt
344   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
345   wt         wt             wt       wt   wt        wt        wt       wt       wt        wt
347   wt         wt             wt       wt   wt        wt        wt       +        wt        wt

Potential Applications

The results of the ABA and drought assays confirm that G682 and its related sequences, including the soy G3445 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3445 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation.

Soy G3448 (SEQ ID NO: 1087, 553 and 2917)

G3448 is a soy gene that was identified as being a putative ortholog of G682. The aim of this project was to determine whether G3448 has an equivalent function to the G682-related genes from Arabidopsis via the analysis of 35S::G3448 Arabidopsis lines.

We have now generated 35S::G3448 lines; these plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a reduction in overall size. These similarities in phenotypes suggest that the genes have similar functions. Additionally, the 35S::G3448 lines showed a somewhat lighter coloration than controls, perhaps indicating that levels of pigments such as anthocyanins were reduced in leaf tissue.

TABLE 42 G3448 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Heat     Drought  Chilling  G682-like
      high NaCl  high mannitol                heat      cold      growth                      root morph.
302   wt         wt             wt       wt   wt        wt        wt       wt       +         +
303   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
305   wt         wt             wt       wt   wt        wt        wt       wt       +         +
308   wt         wt             wt       wt   wt        wt        wt       +        wt        +
309   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
310   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
313   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
314   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
315   wt         wt             wt       wt   wt        wt        wt       wt       +         +
317   wt         wt             wt       wt   wt        wt        wt       wt       wt        +

Potential Applications

The results of these drought and chilling stress assays confirm that G682 and its related sequences, including the soy G3448 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3448 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation. The lighter coloration of 35S::G3448 plants could indicate that G3448 might be used to regulate the production of flavonoid-related compounds, which contribute to the nutritional value of foodstuffs.

Soy G3449 (SEQ ID NO: 1088, 554 and 2918)

G3449 is a soy gene that was identified as being a putative ortholog of G682. The aim of this project was to determine whether G3449 has an equivalent function to the G682-related genes from Arabidopsis via the analysis of 35S::G3449 Arabidopsis lines.

We have now generated 35S::G3449 lines; these plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a slight reduction in overall size. These similarities in phenotypes suggest that the genes have similar functions. Additionally, 35S::G3449 transformants were distinctly paler than wild-type at the seedling stage, perhaps indicating a reduction in the levels of pigments such as anthocyanins.

TABLE 43 G3449 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Heat     Drought  Chilling  G682-like
      high NaCl  high mannitol                heat      cold      growth                      root morph.
303   wt         wt             wt       wt   +         +         wt       wt       wt        +
304   wt         wt             wt       wt   wt        wt        wt       wt       wt        wt
305   wt         wt             wt       wt   wt        +         wt       wt       wt        +
306   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
307   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
309   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
310   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
311   wt         wt             wt       wt   wt        +         wt       wt       +         +
312   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
313   +          wt             wt       wt   wt        wt        wt       wt       wt        +

Potential Applications

The results of the salt and cold germination assays confirm that G682 and its related sequences, including the soy G3449 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3449 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation. The lighter coloration of 35S::G3449 plants could indicate that G3449 might be used to regulate the production of flavonoid-related compounds, which contribute to the nutritional value of foodstuffs.

Soy G3450 (SEQ ID NO: 1084, 550, 1076 and 2919)

G3450 is a soy gene that was identified as being a putative ortholog of G682. Based on a phylogenetic tree built using conserved MYB domains, the G3450 protein appears to be more closely related to the G682-clade of Arabidopsis genes than any of the other putative orthologs included in the study. The aim of this project was to determine whether G3450 has an equivalent function to the G682-related genes from Arabidopsis via the analysis of 35S::G3450 Arabidopsis lines.

We have now generated 35S::G3450 lines; these plants showed comparable morphological effects to 35S::G682 lines and exhibited a glabrous phenotype combined with a slight reduction in overall size. These similarities in phenotypes suggest that the genes have similar functions. Interestingly, 35S::G3450 lines were slightly pale and some of the lines produced pale yellow seed, which likely indicated a reduction in anthocyanin levels in the seed coat. Such an effect was not observed in 35S::G682 seed, but G682 and its paralogs were found during our genomics studies to inhibit anthocyanin production.

35S::G3450 lines have recently been tested in drought-related assays; all of these lines exhibited an increase in root hair density and seven of the ten lines showed an enhanced performance in one or more of the plate-based drought-related stress assays.

The comparable morphological and physiological effects obtained in 35S::G3450 lines versus overexpression lines for the G682-related Arabidopsis genes indicate that the G3450 protein has a very similar or equivalent activity to the Arabidopsis proteins.

TABLE 44 G3450 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Heat     Drought  Chilling  G682-like
      high NaCl  high mannitol                heat      cold      growth                      root morph.
301   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
302   wt         wt             wt       wt   +         +         wt       wt       wt        +
303   wt         wt             wt       wt   wt        +         wt       wt       +         +
304   wt         wt             wt       wt   wt        +         +        wt       wt        +
305   wt         wt             wt       wt   wt        wt        wt       wt       wt        +
306   wt         wt             wt       wt   wt        wt        wt       +        wt        +
307   wt         wt             wt       wt   wt        +         +        wt       +         +
313   wt         wt             wt       wt   wt        wt        wt       wt       +         +
315   +          wt             wt       wt   wt        +         +        +        +         +
317   +          wt             wt       wt   wt        +         wt       wt       +         +

Potential Applications

The results of the heat, cold and salt germination and growth assays confirm that G682 and its related sequences, including the soy G3450 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The effect of G3450 on epidermal patterning indicates that the gene could be applied to manipulate trichome development; in some species trichomes accumulate valuable secondary metabolites and in other instances are thought to provide protection against predation. The lighter coloration of 35S::G3450 plants could indicate that G3450 might be used to regulate the production of flavonoid-related compounds, which contribute to the nutritional value of foodstuffs.

Soy G3452 (SEQ ID NO: 1183, 1162 and 2924)

G3452 is a soy gene that was identified as being a putative ortholog of G867. The aim of this project was to determine whether G3452 has an equivalent function to G867 via the analysis of 35S::G3452 Arabidopsis lines.

35S::G3452 lines displayed a number of morphological features, such as reduced size and alterations in coloration, similar to those seen in earlier studies with 35S::G867 lines. A number of the 35S::G3452 lines were also slightly early flowering.

TABLE 45 G3452 35S, Direct promoter-fusion
Columns: Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
304  +   +   +   wt  wt  +   +   +
305  +   wt  +   wt  wt  +   wt  wt
310  wt  wt  wt  wt  wt  wt  wt  +
314  +   wt  +   wt  wt  wt  wt  +
316  +   wt  +   wt  wt  wt  wt  +
318  wt  wt  +   wt  wt  wt  wt  +

Potential Applications

The results of these abiotic stress assays confirm that G867 and its related sequences, including the soy G3452 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

The accelerated flowering seen in 35S::G3452 plants indicates that the gene could be used to manipulate flowering time. In particular, shortening generation times would help speed up breeding programs, particularly in species such as trees, which typically grow for many years before flowering. Conversely, it might be possible to modify the activity of G3452 (or its orthologs) to delay flowering in order to achieve an increase in biomass and yield.

Soy G3465 (SEQ ID NO: 1206 and 1242)

G3465 is a soy gene that was identified as being a putative ortholog of G912. On a phylogenetic tree, this gene appears to be more closely related to G912 and CBF1-3 than to the two related Arabidopsis genes, G2107 and G2513. The aim of this project was to determine whether G3465 has an equivalent function to G912 via the analysis of 35S::G3465 Arabidopsis lines.

Overexpression of G3465 produced deleterious effects in Arabidopsis; 35S::G3465 lines were small, dark in coloration and slow growing. Such features were comparable to, and possibly even more severe than, those shown by 35S::G912 lines, indicating that the two genes likely have a similar function. One overexpressing line was shown to have increased germination in high mannitol relative to wild-type control plants, and another line was insensitive to ABA.

TABLE 46 G3465 35S, Direct promoter-fusion
Columns: Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
321  wt  wt  wt  wt  wt
322  wt  wt  wt  wt  wt
323  wt  wt  wt  wt  wt
324  wt  wt  wt  wt  wt
341  wt  wt  wt  wt  wt
343  wt  wt  wt  wt  wt
344  wt  wt  wt  wt  wt
346  wt  +   wt  wt  wt
347  wt  wt  wt  wt  wt
348  wt  wt  wt  +   wt

Potential Applications

Given the similar morphological effects of G3465 and G912 overexpression, it is probable that the genes have comparable roles. The morphological effects that were apparent in these lines suggest that the utilization of G3465 might benefit from optimization by use of different promoters or protein modifications.

The results of the mannitol and ABA germination assays confirm that G912 and its related sequences, including the soy G3465 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Soy G3469 (SEQ ID NO: 1210 and 1237)

G3469 is a soy gene that was identified as being a putative ortholog of G912. Morphologically, the overexpressors of G3469 ranged from plants that were somewhat small in size to plants with no consistent differences from controls.

Results of plate-based assays show that G3469, like its ortholog G912, is able to confer abiotic stress tolerance in plants, as indicated by the greater tolerance of G3469 overexpressors, relative to wild-type control plants, to chilling conditions and germination in high salt.

TABLE 47 G3469 35S, Direct promoter-fusion
Columns: Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
302  wt  wt  wt  wt  wt  wt  wt  wt  wt
303  +   wt  wt  wt  wt  wt  wt  wt  wt
305  +   wt  wt  wt  wt  wt  wt  wt  wt
309  wt  wt  wt  wt  wt  wt  wt  wt  wt
310  +   wt  wt  wt  wt  wt  wt  wt  wt
311  +   wt  wt  wt  wt  wt  wt  wt  wt
312  +   wt  wt  wt  wt  wt  wt  wt  wt
314  wt  wt  wt  wt  wt  wt  wt  wt  wt
315  wt  wt  wt  wt  wt  wt  wt  wt  +
319  +   wt  wt  wt  wt  wt  wt  +
304  wt  wt  wt  wt  wt  wt  wt  wt  +
307  wt  wt  wt  wt  wt  wt  wt  wt  +
317  wt  wt  wt  wt  wt  wt  wt  wt  +
318  wt  wt  wt  wt  wt  wt  wt  +   wt

Potential Applications

The results of the salt, drought and chilling assays confirm that G912 and its related sequences, including the soy G3469 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Soy G3470 (SEQ ID NO: 2947 and 2948)

G3470, from Glycine max, is an ortholog of G481 and G482. Among the Arabidopsis paralogs in the G481 study group, G3470 is phylogenetically most closely related to G481 itself.

Experimental Observations

The aim of this study was to assess the role of G3470 in drought stress-related tolerance via overexpression, and compare the effects with those of the other G481 orthologs and paralogs.

Twenty 35S::G3470 lines were examined. Half of the transformants exhibited a marked delay in flowering, of about 1 week, and had rather dark, narrowed leaves compared to wild-type or non-transformed controls. This same phenotype was noted for G481 and some of its other putative orthologs.

35S::G3470 lines have now been tested in plate-based physiology assays; seven of ten lines showed enhanced germination, relative to wild type, when tested in NaCl germination assays. Additionally, two 35S::G3470 lines showed an enhanced performance in a heat growth assay.

TABLE 48 G3470 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
301   wt         wt             wt       wt   wt        wt        +        wt       wt
302   +          wt             wt       wt   wt        wt        wt       wt       wt
303   +          wt             wt       wt   wt        wt        +        wt       wt
305   +          wt             wt       wt   wt        wt        wt       wt       wt
306   wt         wt             wt       wt   wt        wt        wt       wt       wt
309   +          wt             wt       wt   wt        wt        wt       wt       wt
310   +          wt             wt       wt   wt        wt        wt       wt       wt
316   +          wt             wt       wt   wt        wt        wt       wt       wt
317   wt         wt             wt       wt   wt        wt        wt       wt       wt
318   +          wt             wt       wt   wt        wt        wt       wt       wt

Potential Applications

The results of the salt germination assay confirm that G481 and its related sequences, including the soy G3470 sequence, are excellent candidates for improving abiotic stress tolerance in plants.
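The line counts quoted for the 35S::G3470 lines can be checked directly against Table 48. The following Python sketch is illustrative only and is not part of the described method. It assumes that "+" denotes a line that scored better than wild type in an assay and that "wt" denotes a wild-type response; the names ASSAYS, TABLE_48 and tally_positive are hypothetical.

```python
# Column order follows the header of Table 48.
ASSAYS = [
    "Germ. in high NaCl", "Germ. in high mannitol", "Sucrose", "ABA",
    "Germ. in heat", "Germ. in cold", "Growth in heat", "Drought", "Chilling",
]

# Scores transcribed from Table 48 (35S::G3470), one string per line.
TABLE_48 = {
    301: "wt wt wt wt wt wt + wt wt",
    302: "+ wt wt wt wt wt wt wt wt",
    303: "+ wt wt wt wt wt + wt wt",
    305: "+ wt wt wt wt wt wt wt wt",
    306: "wt wt wt wt wt wt wt wt wt",
    309: "+ wt wt wt wt wt wt wt wt",
    310: "+ wt wt wt wt wt wt wt wt",
    316: "+ wt wt wt wt wt wt wt wt",
    317: "wt wt wt wt wt wt wt wt wt",
    318: "+ wt wt wt wt wt wt wt wt",
}


def tally_positive(table, assays=ASSAYS):
    """Count, per assay, the lines scored '+' (assumed: better than wild type)."""
    counts = {assay: 0 for assay in assays}
    for line, row in table.items():
        scores = row.split()
        if len(scores) != len(assays):
            raise ValueError(f"line {line}: expected {len(assays)} scores")
        for assay, score in zip(assays, scores):
            if score.startswith("+"):
                counts[assay] += 1
    return counts


counts = tally_positive(TABLE_48)
for assay, n in counts.items():
    print(f"{assay}: {n} of {len(TABLE_48)} lines")
```

Run as written, the tally reports seven lines for germination in high NaCl and two lines for growth in heat, in agreement with the counts stated in the text.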

Soy G3471 (SEQ ID NO: 2949 and 2950)

G3471 from Glycine max is an ortholog of G481 and G482. Among the Arabidopsis paralogs in the G481 study group, G3471 is phylogenetically most closely related to G481.

Experimental Observations

The aim of this study was to assess the role of G3471 in drought stress-related tolerance via overexpression, and compare the effects with those of the other G481 orthologs and paralogs.

Changes in flowering time were seen among the 35S::G3471 lines. A number of lines appeared late flowering, while others showed a marginal acceleration of flowering. Some of the 35S::G3471 lines also showed alterations in leaf shape.

Several lines performed better than controls in abiotic stress assays, including germination in heat, growth in cold conditions, and better drought tolerance in a plate-based assay.

TABLE 49 G3471 35S, Direct promoter-fusion
Columns: Line, Germ. in high NaCl, Germ. in high mannitol, Sucrose, ABA, Germ. in heat, Germ. in cold, Growth in heat, Drought, Chilling
303  wt  wt  wt  wt  +   wt  wt  wt  +
306  wt  wt  wt  wt  wt  wt  wt  wt  wt
307  wt  wt  wt  wt  +   wt  wt  wt  wt
308  wt  wt  wt  wt  wt  wt  wt  +   wt
312  wt  wt  wt  wt  +   wt  wt  wt  wt
328  wt  wt  wt  wt  wt  wt  wt  wt  wt
329  wt  wt  wt  wt  wt  wt  wt  wt
330  wt  wt  wt  wt  wt  wt  wt  wt
337  wt  wt  wt  wt  wt  wt  wt  wt
338  wt  wt  wt  wt  wt  wt  wt  wt  wt

Potential Applications

The results of these abiotic stress assays confirm that G481 and its related sequences, including the soy G3471 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Soy G3472 (SEQ ID NO: 801 and 2907)

G3472, from Glycine max, is an ortholog of G481 and G482, and is a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. Among the Arabidopsis paralogs in the G481 study group, this gene is phylogenetically most closely related to G485.

Experimental Observations

The aim of this study was to assess the role of G3472 in drought stress-related tolerance via overexpression. 35S::G3472 lines showed no consistent differences in morphology from wild-type or non-transformed controls. G3472 did not produce accelerated flowering in the same manner as did other G485-related genes such as G3474, G3475 and G3476.

Three 35S::G3472 lines were more salt tolerant than wild-type or non-transformed controls in a plate-based assay.

TABLE 50 G3472 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
303   wt         wt             wt       wt   wt        wt        wt       wt       wt
304   wt         wt             wt       wt   wt        wt        wt       wt       wt
305   wt         wt             wt       wt   wt        wt        wt       wt       wt
306   wt         wt             wt       wt   wt        wt        wt       wt       wt
307   wt         wt             wt       wt   wt        wt        wt       wt       wt
308   +          wt             wt       wt   wt        wt        wt       wt       wt
311   +          wt             wt       wt   wt        wt        wt       wt       wt
313   wt         wt             wt       wt   wt        wt        wt       wt       wt
314   wt         wt             wt       wt   wt        wt        wt       wt       wt
318   +          wt             wt       wt   wt        wt        wt       wt       −

Potential Applications

The results of the salt germination assay confirm that G481 and its related sequences, including the soy G3472 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Soy G3475 (SEQ ID NO: 800 and 2908)

G3475, from Glycine max, is a putative ortholog of G481 and a member of the HAP3 subgroup of the CCAAT-box binding transcription factor family. Among the Arabidopsis paralogs in the G481 study group, this gene is phylogenetically most closely related to G485.

Experimental Observations

35S::G3475 lines showed accelerated flowering by about one to two weeks compared to wild-type or non-transformed controls. This same phenotype was also noted for the most closely related Arabidopsis gene, G485. Many of these early flowering lines also were smaller than controls.

Four 35S::G3475 lines were more tolerant to cold conditions during growth than wild-type or non-transformed controls.

TABLE 51 G3475 35S, Direct promoter-fusion
Line  Germ. in   Germ. in       Sucrose  ABA  Germ. in  Germ. in  Growth   Drought  Chilling
      high NaCl  high mannitol                heat      cold      in heat
301   wt         wt             wt       wt   wt        wt        wt       wt       wt
302   wt         wt             wt       wt   wt        wt        wt       wt       +
303   wt         wt             wt       wt   wt        wt        wt       wt       wt
304   wt         wt             wt       wt   wt        wt        wt       wt       +
306   wt         wt             wt       wt   wt        wt        wt       wt       +
307   wt         wt             wt       wt   wt        wt        wt       wt       +
308   wt         wt             wt       wt   wt        wt        wt       wt       wt
309   wt         wt             wt       wt   wt        wt        wt       wt       wt
310   wt         wt             wt       wt   wt        wt        wt       wt       wt
311   wt         wt             wt       wt   wt        wt        wt       wt       wt

Potential Applications

The results of the chilling assay confirm that G481 and its related sequences, including the soy G3475 sequence, are excellent candidates for improving abiotic stress tolerance in plants.

Example XIV Transformation of Cereal Plants with an Expression Vector

Cereal plants such as, but not limited to, corn, wheat, rice, sorghum, or barley, may also be transformed with the present polynucleotide sequences in pMEN20 or pMEN65 expression vectors for the purpose of modifying plant traits. For example, pMEN20 may be modified to replace the NptII coding region with the BAR gene of Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI and BglII sites of the BAR gene are removed by site-directed mutagenesis with silent codon changes.
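The removal of the KpnI and BglII sites by silent codon changes can be illustrated computationally. The following Python sketch is an illustration under stated assumptions rather than the procedure actually used: it scans an in-frame coding sequence for the KpnI (GGTACC) and BglII (AGATCT) recognition sequences and replaces one overlapping codon with a synonymous codon of the standard genetic code, so that the site disappears while the encoded protein is unchanged. The function names and the toy coding sequence are hypothetical; the toy sequence is not the BAR gene sequence, and host codon usage preferences are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Recognition sequences; both are palindromic, so scanning one strand suffices.
SITES = {"KpnI": "GGTACC", "BglII": "AGATCT"}


def translate(cds):
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def count_sites(cds, sites=SITES):
    """Total number of recognition sequences present in the sequence."""
    return sum(cds.count(seq) for seq in sites.values())


def break_one_site(cds, pos, length, sites=SITES):
    """Swap one codon overlapping the site at `pos` for a synonymous codon."""
    before = count_sites(cds, sites)
    for idx in range(pos // 3, (pos + length - 1) // 3 + 1):
        start = idx * 3
        codon = cds[start:start + 3]
        for alt, aa in CODON_TABLE.items():
            if alt == codon or aa != CODON_TABLE[codon]:
                continue
            candidate = cds[:start] + alt + cds[start + 3:]
            # Accept only changes that lower the total number of sites.
            if count_sites(candidate, sites) < before:
                return candidate
    raise ValueError(f"no synonymous change removes the site at position {pos}")


def remove_sites(cds, sites=SITES):
    """Return a coding sequence free of the listed sites, protein unchanged."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = translate(cds)
    while count_sites(cds, sites):
        pos, length = min((cds.find(s), len(s)) for s in sites.values() if s in cds)
        cds = break_one_site(cds, pos, length, sites)
    assert translate(cds) == protein  # the changes must be silent
    return cds


# Hypothetical toy sequence containing one KpnI and one BglII site.
toy_cds = "ATGGGTACCAGATCTTAA"
edited = remove_sites(toy_cds)
print(toy_cds, "->", edited)
```

In practice the choice among synonymous codons would also take into account codon usage in the target plant, which this sketch does not attempt.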

The cloning vector may be introduced into a variety of cereal plants by means well known in the art such as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. It is now routine to produce transgenic plants of most cereal crops (Vasil (1994) Plant Mol. Biol. 25: 925-937) such as corn, wheat, rice, sorghum (Casas et al. (1993) Proc. Natl. Acad. Sci. 90: 11212-11216), and barley (Wan and Lemaux (1994) Plant Physiol. 104:37-48). DNA transfer methods such as microprojectile bombardment can be used for corn (Fromm et al. (1990) Bio/Technol. 8: 833-839; Gordon-Kamm et al. (1990) Plant Cell 2: 603-618; Ishida et al. (1996) Nature Biotechnol. 14:745-750), wheat (Vasil et al. (1992) Bio/Technol. 10:667-674; Vasil et al. (1993) Bio/Technol. 11:1553-1558; Weeks et al. (1993) Plant Physiol. 102:1077-1084), and rice (Christou (1991) Bio/Technol. 9:957-962; Hiei et al. (1994) Plant J. 6:271-282; Aldemita and Hodges (1996) Planta 199:612-617; and Hiei et al. (1997) Plant Mol. Biol. 35:205-218). For most cereal plants, embryogenic cells derived from immature scutellum tissues are the preferred cellular targets for transformation (Hiei et al. (1997) Plant Mol. Biol. 35:205-218; Vasil (1994) Plant Mol. Biol. 25: 925-937).

Vectors according to the present invention may be transformed into corn embryogenic cells derived from immature scutellar tissue by using microprojectile bombardment, with the A188XB73 genotype as the preferred genotype (Fromm et al. (1990) Bio/Technol. 8: 833-839; Gordon-Kamm et al. (1990) Plant Cell 2: 603-618). After microprojectile bombardment the tissues are selected on phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al. (1990) Plant Cell 2: 603-618). Transgenic plants are regenerated by standard corn regeneration techniques (Fromm et al. (1990) Bio/Technol. 8: 833-839; Gordon-Kamm et al. (1990) Plant Cell 2: 603-618).

The plasmids prepared as described above can also be used to produce transgenic wheat and rice plants (Christou (1991) Bio/Technol. 9:957-962; Hiei et al. (1994) Plant J. 6:271-282; Aldemita and Hodges (1996) Planta 199:612-617; and Hiei et al. (1997) Plant Mol. Biol. 35:205-218) that coordinately express genes of interest by following standard transformation protocols known to those skilled in the art for rice and wheat (Vasil et al. (1992) Bio/Technol. 10:667-674; Vasil et al. (1993) Bio/Technol. 11:1553-1558; and Weeks et al. (1993) Plant Physiol. 102:1077-1084), where the BAR gene is used as the selectable marker.

Example XV Transformation of Canola with a Plasmid Containing CBF1, CBF2, or CBF3

After genes homologous to CBF1 were identified, canola was transformed with a plasmid containing the Arabidopsis CBF1, CBF2, or CBF3 genes cloned into the vector pGA643 (An (1987) Methods Enzymol. 153: 292). In these constructs the CBF genes were expressed constitutively under the CaMV 35S promoter. In addition, the CBF1 gene was cloned under the control of the Arabidopsis COR15 promoter in the same vector pGA643. Each construct was transformed into Agrobacterium strain GV3101. Transformed Agrobacteria were grown for 2 days in minimal AB medium containing appropriate antibiotics.

Spring canola (B. napus cv. Westar) was transformed using the protocol of Moloney et al. (1989) Plant Cell Reports 8:238, with some modifications as described. Briefly, seeds were sterilized and plated on half-strength MS medium containing 1% sucrose. Plates were incubated at 24° C. under 60-80 μE/m²s light using a 16 hour light/8 hour dark photoperiod. Cotyledons from 4-5 day old seedlings were collected, and the petioles were cut and dipped into the Agrobacterium solution. The dipped cotyledons were placed on co-cultivation medium at a density of 20 cotyledons/plate and incubated as described above for 3 days. Explants were transferred to the same medium, but containing 300 mg/l timentin (SmithKline Beecham, Pa.), and thinned to 10 cotyledons/plate. After 7 days explants were transferred to Selection/Regeneration medium. Transfers were continued every 2-3 weeks (2 or 3 times) until shoots had developed. Shoots were transferred to Shoot-Elongation medium every 2-3 weeks. Healthy looking shoots were transferred to rooting medium. Once good roots had developed, the plants were placed into moist potting soil.

The transformed plants were then analyzed for the presence of the NPTII gene/kanamycin resistance by ELISA, using the ELISA NPTII kit from 5Prime-3Prime Inc. (Boulder, Colo.). Approximately 70% of the screened plants were NPTII positive. Only the NPTII-positive plants were further analyzed.

Northern blot analysis of the plants that were transformed with the constitutively expressing constructs showed expression of the CBF genes, and all CBF genes were capable of inducing the Brassica napus cold-regulated gene BN115 (homolog of the Arabidopsis COR15 gene). Most of the transgenic plants appear to exhibit a normal growth phenotype. As expected, the transgenic plants are more freezing tolerant than the wild-type plants. Using the electrolyte leakage of leaves test, the control showed a 50% leakage at −2 to −3° C. Spring canola transformed with either CBF1 or CBF2 showed a 50% leakage at −6 to −7° C. Spring canola transformed with CBF3 shows a 50% leakage at about −10 to −15° C. Winter canola transformed with CBF3 may show a 50% leakage at about −16 to −20° C. Furthermore, if the spring or winter canola are cold acclimated the transformed plants may exhibit a further increase in freezing tolerance of at least −2° C.
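The temperature at which 50% electrolyte leakage occurs can be read off a leakage-versus-temperature curve. As a minimal sketch, assuming leakage has been measured as a percentage at a series of freezing temperatures, the crossing point can be estimated by linear interpolation between the two bracketing measurements. The function name and the readings below are invented for illustration and are not data from this example.

```python
def temperature_at_leakage(readings, threshold=50.0):
    """Interpolate the temperature (deg C) at which leakage reaches `threshold`.

    `readings` is an iterable of (temperature_c, percent_leakage) pairs.
    Returns None if the threshold is never crossed while cooling.
    """
    points = sorted(readings, reverse=True)  # warmest temperature first
    for (t_warm, l_warm), (t_cold, l_cold) in zip(points, points[1:]):
        if l_warm < threshold <= l_cold:
            fraction = (threshold - l_warm) / (l_cold - l_warm)
            return t_warm + fraction * (t_cold - t_warm)
    return None


# Hypothetical readings for a control plant: (temperature in deg C, % leakage).
control_readings = [(0, 10.0), (-2, 40.0), (-4, 80.0), (-6, 95.0)]
lt50 = temperature_at_leakage(control_readings)
print(f"estimated 50% leakage temperature: {lt50:.1f} deg C")
```

A lower (more negative) interpolated temperature for a transformed line than for the control would correspond to the increased freezing tolerance described above.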

To test salinity tolerance of the transformed plants, plants were watered with 150 mM NaCl. Plants overexpressing CBF1, CBF2 or CBF3 grew better compared with plants that had not been transformed with CBF1, CBF2 or CBF3.

These results demonstrate that equivalogs of Arabidopsis transcription factors can be identified and shown to confer similar functions in non-Arabidopsis plant species.

Example XVI Cloning of Transcription Factor Promoters

Promoters are isolated from transcription factor genes that have gene expression patterns useful for a range of applications, as determined by methods well known in the art (including transcript profile analysis with cDNA or oligonucleotide microarrays, Northern blot analysis, semi-quantitative or quantitative RT-PCR). Interesting gene expression profiles are revealed by determining transcript abundance for a selected transcription factor gene after exposure of plants to a range of different experimental conditions, and in a range of different tissue or organ types, or developmental stages. Experimental conditions to which plants are exposed for this purpose include cold, heat, drought, osmotic challenge, varied hormone concentrations (ABA, GA, auxin, cytokinin, salicylic acid, brassinosteroid), and pathogen and pest challenge. The tissue types and developmental stages include stem, root, flower, rosette leaves, cauline leaves, siliques, germinating seed, and meristematic tissue. The set of expression levels provides a pattern that is determined by the regulatory elements of the gene promoter.

Transcription factor promoters for the genes disclosed herein are obtained by cloning 1.5 kb to 2.0 kb of genomic sequence immediately upstream of the translation start codon for the coding sequence of the encoded transcription factor protein. This region includes the 5′-UTR of the transcription factor gene, which can comprise regulatory elements. The 1.5 kb to 2.0 kb region is cloned through PCR methods, using primers that include one in the 3′ direction located at the translation start codon (including appropriate adaptor sequence), and one in the 5′ direction located from 1.5 kb to 2.0 kb upstream of the translation start codon (including appropriate adaptor sequence). The desired fragments are PCR-amplified from Arabidopsis Col-0 genomic DNA using high-fidelity Taq DNA polymerase to minimize the incorporation of point mutation(s). The cloning primers incorporate two rare restriction sites, such as NotI and SfiI, found at low frequency throughout the Arabidopsis genome. Additional restriction sites are used in the instances where a NotI or SfiI restriction site is present within the promoter.
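The selection of the upstream region and the check for internal NotI or SfiI sites can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `upstream_region` and `internal_sites`, the 0-based start codon coordinate, and the forward-strand orientation of the gene are hypothetical choices, not part of the disclosed method. The recognition sequences used are GCGGCCGC for NotI and GGCCNNNNNGGCC for SfiI.

```python
import re

# Recognition sequences for the two rare cutters named in the text.
# Both sites are palindromic, so scanning one strand is sufficient.
# A lookahead is used so that overlapping occurrences are also reported.
RARE_SITES = {
    "NotI": re.compile(r"(?=GCGGCCGC)"),
    "SfiI": re.compile(r"(?=GGCC[ACGT]{5}GGCC)"),
}


def upstream_region(genomic, atg_index, length=2000):
    """Return up to `length` bases immediately upstream of the start codon.

    genomic   -- forward-strand genomic sequence (assumed gene orientation)
    atg_index -- 0-based position of the A of the ATG (assumed convention)
    The slice ends just before the ATG, so it includes the 5'-UTR.
    """
    start = max(0, atg_index - length)
    return genomic[start:atg_index].upper()


def internal_sites(fragment):
    """Map each enzyme name to the 0-based positions of its sites in `fragment`."""
    return {
        name: [m.start() for m in pattern.finditer(fragment)]
        for name, pattern in RARE_SITES.items()
    }


# Toy sequence for illustration only: 10 A's, a NotI site, 20 T's, then ATG.
toy_genomic = "A" * 10 + "GCGGCCGC" + "T" * 20 + "ATG" + "CCC"
toy_fragment = upstream_region(toy_genomic, atg_index=38, length=30)
toy_sites = internal_sites(toy_fragment)
```

If either list is non-empty for a candidate promoter fragment, that enzyme would cut within the fragment, which is the situation in which the text calls for additional restriction sites in the cloning primers.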

The 1.5-2.0 kb fragment upstream from the translation start codon, including the 5′-untranslated region of the transcription factor, is cloned in a binary transformation vector immediately upstream of a suitable reporter gene, or a transactivator gene that is capable of programming expression of a reporter gene in a second gene construct. Reporter genes used include green fluorescent protein (and related fluorescent protein color variants), beta-glucuronidase, and luciferase. Suitable transactivator genes include LexA-GAL4, along with a transactivatable reporter in a second binary plasmid (as disclosed in U.S. patent application Ser. No. 09/958,131, incorporated herein by reference). The binary plasmid(s) is transferred into Agrobacterium and the structure of the plasmid confirmed by PCR. These strains are introduced into Arabidopsis plants as described in other examples, and gene expression patterns determined according to standard methods known to one skilled in the art for monitoring GFP fluorescence, beta-glucuronidase activity, or luminescence.

All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

The present invention is not limited by the specific embodiments described herein. The invention now being fully described, it will be apparent to one of ordinary skill in the art that many changes and modifications can be made thereto without departing from the spirit or scope of the appended claims. Modifications that become apparent from the foregoing description and accompanying figures fall within the scope of the claims.

1. A transgenic plant having greater tolerance to cold or sucrose as compared to a non-transgenic or wild-type control plant of the same species; wherein the transgenic plant comprises in its genome a recombinant polynucleotide encoding a polypeptide member of the MYB-related transcription factor family; wherein overexpression of the polypeptide member confers said greater tolerance to cold or sucrose; and wherein the polypeptide member of the MYB-related transcription factor family comprises a MYB-related domain that shares an amino acid identity of at least 80% with amino acids 28-64 of SEQ ID NO: 1082.

2. The transgenic plant of claim 1, wherein MYB-related domain binds to DNA at a transcription regulating region; wherein said binding regulates transcription of the DNA; and said regulation of transcription confers the greater tolerance to cold or sucrose in the transgenic plant as compared to the control plant.

3. The transgenic plant of claim 1, wherein the recombinant polynucleotide is derived from a monocotyledonous plant.

4. The transgenic plant of claim 1, wherein the MYB-related domain is at least 95% identical to amino acids 28-64 of SEQ ID NO: 1082.

5. The transgenic plant of claim 1, wherein the polypeptide member of the MYB-related transcription factor family comprises SEQ ID NO: 1082.

6. The transgenic plant of claim 1, wherein the transgenic plant is more tolerant to 8° C. or 9.4% sucrose.

7. The transgenic plant of claim 1, wherein the recombinant polynucleotide further comprises a constitutive, inducible, or tissue-specific promoter that regulates expression of the polypeptide member of the MYB-related transcription factor family.

8. The transgenic plant of claim 1, wherein the recombinant polynucleotide is incorporated into an expression vector comprising one or more regulatory elements that regulate expression of the polypeptide member of the MYB-related transcription factor family.

9. The transgenic plant of claim 1, wherein the transgenic plant is a cultured host cell.

10. A transformed seed from the transgenic plant according to claim 1.

11. A method for producing a transgenic plant having greater tolerance to cold or sucrose as compared to a non-transgenic or wild-type control plant of the same species, the method steps comprising: (a) providing an expression vector comprising: (i) a polynucleotide sequence comprising a nucleotide sequence encoding a polypeptide member of the MYB-related transcription factor family; wherein said polypeptide member of the MYB-related transcription factor family comprises a MYB-related domain that shares an amino acid identity of at least 80% with amino acids 28-64 of SEQ ID NO: 1082 and regulates transcription of DNA, and said regulation of transcription confers greater tolerance to cold or sucrose in the transgenic plant as compared to the non-transgenic or wild-type control plant; and (ii) a regulatory element operably linked to the nucleotide sequence, said regulatory element controlling expression of said nucleotide sequence in the transgenic plant; (b) introducing the expression vector into a plant cell; and (c) growing the plant cell into the transgenic plant.

12. The method of claim 11, the method steps further comprising: (d) crossing the transgenic plant with itself or another plant; (e) selecting transformed seed that develops as a result of said crossing; and (f) growing a progeny plant from the transformed seed, thus producing a transgenic progeny plant having greater tolerance to cold or sucrose as compared to the non-transgenic or wild-type control plant of the same species.

13. The method of claim 12, wherein: said transgenic progeny plant expresses mRNA that encodes a DNA-binding protein that binds to a DNA regulatory sequence and induces expression of a plant trait gene; and said mRNA is expressed at a level greater than in the non-transgenic or wild-type control plant.

14. The method of claim 11, wherein the MYB-related domain shares an amino acid identity of at least 95% with amino acids 28-64 of SEQ ID NO: 1082.

15. The method of claim 11, wherein the polypeptide member of the MYB-related transcription factor family comprises SEQ ID NO: 1082.

16. A transgenic plant comprising in its genome a recombinant polynucleotide encoding a polypeptide member of the MYB-related transcription factor family, wherein the polypeptide member comprises SEQ ID NO: 1082.

17. A method for producing a transgenic plant, the method steps comprising: (a) providing an expression vector comprising: (i) a polynucleotide sequence encoding a polypeptide member of the MYB-related transcription factor family, wherein the polypeptide member comprises SEQ ID NO: 1082; and (ii) a regulatory element operably linked to the nucleotide sequence, said regulatory element controlling expression of said nucleotide sequence in the transgenic plant; (b) introducing the expression vector into a plant cell; and (c) growing the plant cell into the transgenic plant.